Daniel Walker External

Since: May 14, 2006 Posts: 391
|
Posted: Wed Sep 20, 2006 10:40 pm Post subject: Re: 2.6.18-rt1 Archived from groups: linux>kernel |
|
|
On Wed, 2006-09-20 at 22:14 +0200, Ingo Molnar wrote:
> > if (up->port.sysrq) {
> > /* serial8250_handle_port() already took the lock */
> > locked = 0;
This path ran with interrupts off in the !PREEMPT_RT case, but your
change leaves them on here: _irqsave now only runs in two of the three
cases.
Daniel
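To make the concern concrete, here is a minimal userspace sketch in C. It is not the patch under discussion and not kernel code. It assumes the three cases are the sysrq path quoted above, an oops-in-progress trylock path, and the normal lock path; only the sysrq branch appears in the quoted hunk, so the other two, and all names below, are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the local interrupt state; not kernel code. */
static bool irqs_enabled = true;

/* Assumed three cases; only the sysrq one is visible in the quoted hunk. */
enum console_case { CASE_SYSRQ, CASE_OOPS, CASE_NORMAL };

/* Old pattern: interrupts are disabled once, before any case is chosen. */
static bool irqs_off_before_branch(enum console_case c)
{
	irqs_enabled = true;
	irqs_enabled = false;          /* models local_irq_save() up front */
	(void)c;                       /* every case runs with IRQs off */
	bool off = !irqs_enabled;
	irqs_enabled = true;           /* models local_irq_restore() */
	return off;
}

/* Changed pattern: only the two branches that take the lock use _irqsave. */
static bool irqs_off_per_branch(enum console_case c)
{
	irqs_enabled = true;
	switch (c) {
	case CASE_SYSRQ:
		/* lock already held by the caller: IRQ state is left untouched */
		break;
	case CASE_OOPS:                /* models a trylock _irqsave variant */
	case CASE_NORMAL:              /* models spin_lock_irqsave() */
		irqs_enabled = false;
		break;
	}
	bool off = !irqs_enabled;
	irqs_enabled = true;
	return off;
}
```

In this model, irqs_off_before_branch() reports interrupts off for every case, while irqs_off_per_branch() reports them still on for CASE_SYSRQ, which is the asymmetry being pointed out.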
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/ |
 |
K.R. Foley External

Since: May 26, 2006 Posts: 17
|
Posted: Wed Sep 20, 2006 10:40 pm Post subject: Re: 2.6.18-rt1 |
|
|
Thomas Gleixner wrote:
> On Wed, 2006-09-20 at 13:56 -0500, K.R. Foley wrote:
>> EIP: 0060:[<c0130ad4>] Not tainted VLI
>> EFLAGS: 00010297 (2.6.17-rt8 #4)
>
> ----------------------^^^^^^^^^^^^^
>
> Are you sure, that kernel and .config are related ?
>
> tglx
Not at all. See my response to Ingo.
--
kr
 |
Thomas Gleixner External

Since: May 14, 2006 Posts: 943
|
Posted: Wed Sep 20, 2006 10:50 pm Post subject: Re: 2.6.18-rt1 |
|
|
On Wed, 2006-09-20 at 15:33 -0500, K.R. Foley wrote:
> DOH! The log had two different boots in it. Let's try this again. By
> the way, you may notice from my screw up that this is pretty much the
> same oops that I got with 2.6.17-rt*. I have been getting this on all of
> my SMP systems since we went past 2.6.16.
Which module is being modprobed?
tglx
 |
Michal Piotrowski External

Since: May 15, 2006 Posts: 326
|
Posted: Wed Sep 20, 2006 11:00 pm Post subject: Re: 2.6.18-rt1 |
|
|
Hi,
On 20/09/06, Ingo Molnar <mingo@elte.hu> wrote:
> I'm pleased to announce the 2.6.18-rt1 tree, which can be downloaded
> from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
2.6.18-rt3
Ingo, can you take a look at these bugs?
BUG: scheduling with irqs disabled: softirq-timer/1/0x00000001/17
caller is rt_spin_lock_slowlock+0x121/0x1af
[<c0104356>] show_trace_log_lvl+0x68/0x193
[<c0104a5a>] show_trace+0x1b/0x20
[<c0104b38>] dump_stack+0x1f/0x24
[<c02edb34>] schedule+0x69/0xe3
[<c02ee8c0>] rt_spin_lock_slowlock+0x121/0x1af
[<c02eef4f>] rt_spin_lock+0x10/0x2c
[<c0171d6d>] __kmalloc+0xad/0x115
[<c020c146>] soft_cursor+0x52/0x16c
[<c020c00b>] bit_cursor+0x483/0x498
[<c02071ab>] fbcon_cursor+0x218/0x24d
[<c0244ac1>] hide_cursor+0x22/0x61
[<c024811f>] vt_console_print+0x91/0x207
[<c0120d31>] __call_console_drivers+0x6f/0x8c
[<c0120dac>] _call_console_drivers+0x5e/0x67
[<c0121367>] release_console_sem+0x132/0x1f2
[<c0121051>] vprintk+0x28d/0x2f5
[<c01210d3>] printk+0x1a/0x1c
[<c0129ab3>] run_timer_softirq+0x77b/0xbb6
[<c012632c>] ksoftirqd+0x11d/0x200
[<c013386b>] kthread+0xc7/0xf8
[<c0101005>] kernel_thread_helper+0x5/0xb
DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
Leftover inexact backtrace:
[<c0104a5a>] show_trace+0x1b/0x20
[<c0104b38>] dump_stack+0x1f/0x24
[<c02edb34>] schedule+0x69/0xe3
[<c02ee8c0>] rt_spin_lock_slowlock+0x121/0x1af
[<c02eef4f>] rt_spin_lock+0x10/0x2c
[<c0171d6d>] __kmalloc+0xad/0x115
[<c020c146>] soft_cursor+0x52/0x16c
[<c020c00b>] bit_cursor+0x483/0x498
[<c02071ab>] fbcon_cursor+0x218/0x24d
[<c0244ac1>] hide_cursor+0x22/0x61
[<c024811f>] vt_console_print+0x91/0x207
[<c0120d31>] __call_console_drivers+0x6f/0x8c
[<c0120dac>] _call_console_drivers+0x5e/0x67
[<c0121367>] release_console_sem+0x132/0x1f2
[<c0121051>] vprintk+0x28d/0x2f5
[<c01210d3>] printk+0x1a/0x1c
[<c0129ab3>] run_timer_softirq+0x77b/0xbb6
[<c012632c>] ksoftirqd+0x11d/0x200
[<c013386b>] kthread+0xc7/0xf8
[<c0101005>] kernel_thread_helper+0x5/0xb
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
... [<c02ef447>] .... __spin_lock+0x12/0x35
......[<c0129387>] .. ( <= run_timer_softirq+0x4f/0xbb6)
skipping trace printing on CPU#1 != -1
BUG: scheduling while atomic: softirq-timer/1/0x00000001/17, CPU#1
[<c0104356>] show_trace_log_lvl+0x68/0x193
[<c0104a5a>] show_trace+0x1b/0x20
[<c0104b38>] dump_stack+0x1f/0x24
[<c02ecc43>] __schedule+0x7d/0xded
[<c02edb8d>] schedule+0xc2/0xe3
[<c02ee8c0>] rt_spin_lock_slowlock+0x121/0x1af
[<c02eef4f>] rt_spin_lock+0x10/0x2c
[<c0171d6d>] __kmalloc+0xad/0x115
[<c020c146>] soft_cursor+0x52/0x16c
[<c020c00b>] bit_cursor+0x483/0x498
[<c02071ab>] fbcon_cursor+0x218/0x24d
[<c0244ac1>] hide_cursor+0x22/0x61
[<c024811f>] vt_console_print+0x91/0x207
[<c0120d31>] __call_console_drivers+0x6f/0x8c
[<c0120dac>] _call_console_drivers+0x5e/0x67
[<c0121367>] release_console_sem+0x132/0x1f2
[<c0121051>] vprintk+0x28d/0x2f5
[<c01210d3>] printk+0x1a/0x1c
[<c0129ab3>] run_timer_softirq+0x77b/0xbb6
[<c012632c>] ksoftirqd+0x11d/0x200
[<c013386b>] kthread+0xc7/0xf8
[<c0101005>] kernel_thread_helper+0x5/0xb
DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
Leftover inexact backtrace:
[<c0104a5a>] show_trace+0x1b/0x20
[<c0104b38>] dump_stack+0x1f/0x24
[<c02ecc43>] __schedule+0x7d/0xded
[<c02edb8d>] schedule+0xc2/0xe3
[<c02ee8c0>] rt_spin_lock_slowlock+0x121/0x1af
[<c02eef4f>] rt_spin_lock+0x10/0x2c
[<c0171d6d>] __kmalloc+0xad/0x115
[<c020c146>] soft_cursor+0x52/0x16c
[<c020c00b>] bit_cursor+0x483/0x498
[<c02071ab>] fbcon_cursor+0x218/0x24d
[<c0244ac1>] hide_cursor+0x22/0x61
[<c024811f>] vt_console_print+0x91/0x207
[<c0120d31>] __call_console_drivers+0x6f/0x8c
[<c0120dac>] _call_console_drivers+0x5e/0x67
[<c0121367>] release_console_sem+0x132/0x1f2
[<c0121051>] vprintk+0x28d/0x2f5
[<c01210d3>] printk+0x1a/0x1c
[<c0129ab3>] run_timer_softirq+0x77b/0xbb6
[<c012632c>] ksoftirqd+0x11d/0x200
[<c013386b>] kthread+0xc7/0xf8
[<c0101005>] kernel_thread_helper+0x5/0xb
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
... [<c02ef447>] .... __spin_lock+0x12/0x35
......[<c0129387>] .. ( <= run_timer_softirq+0x4f/0xbb6)
(gdb) l *rt_spin_lock_slowlock+0x121/0x1af
0xc02ee79f is in rt_spin_lock_slowlock (/usr/src/linux-rt/kernel/rtmutex.c:660).
655 * enables the seemless use of arbitrary (blocking) spinlocks within
656 * sleep/wakeup event loops.
657 */
658 static void fastcall noinline __sched
659 rt_spin_lock_slowlock(struct rt_mutex *lock)
660 {
661 struct rt_mutex_waiter waiter;
662 unsigned long saved_state, state;
663
664 debug_rt_mutex_init_waiter(&waiter);
l *__spin_lock+0x12/0x35
0xc02ef435 is in __spin_lock (/usr/src/linux-rt/kernel/spinlock.c:222).
217 _raw_write_lock(lock);
218 }
219 EXPORT_SYMBOL(__write_lock_bh);
220
221 void __lockfunc __spin_lock(raw_spinlock_t *lock)
222 {
223 preempt_disable();
224 spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
225 _raw_spin_lock(lock);
226 }
l *0xc0129387
0xc0129387 is in run_timer_softirq
(/usr/src/linux-rt/include/linux/seqlock.h:148).
143 }
144
145 static __always_inline void __write_seqlock_raw(raw_seqlock_t *sl)
146 {
147 spin_lock(&sl->lock);
148 ++sl->sequence;
149 smp_wmb();
150 }
151
152 static __always_inline void __write_sequnlock_raw(raw_seqlock_t *sl)
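The two BUG reports above share one pattern: a raw lock (which disables preemption) is held in run_timer_softirq while printk ends up in __kmalloc, whose lock is a sleeping one on -rt. A toy userspace model in C, assuming nothing beyond what the traces show; every name is illustrative and this is not a proposed fix:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: all names below are illustrative. */
static int preempt_count;      /* > 0 means the task must not sleep */
static int atomic_sleep_bugs;  /* times we tried to schedule while atomic */

/* Models __spin_lock() on a raw lock: it calls preempt_disable(). */
static void raw_lock(void)   { preempt_count++; }
static void raw_unlock(void) { assert(preempt_count > 0); preempt_count--; }

/* A contended sleeping lock has to call the scheduler in order to wait. */
static void sleeping_lock_contended(void)
{
	if (preempt_count != 0)
		atomic_sleep_bugs++;   /* the condition the BUG lines report */
}

/* Mirrors the reported chain: raw lock held, then a sleeping lock taken. */
static int print_under_raw_lock(void)
{
	atomic_sleep_bugs = 0;
	raw_lock();                    /* __write_seqlock_raw() in the trace */
	sleeping_lock_contended();     /* rt_spin_lock_slowlock() via __kmalloc */
	raw_unlock();
	return atomic_sleep_bugs;
}

/* Same work, but the sleeping lock is only taken after the raw unlock. */
static int print_after_raw_unlock(void)
{
	atomic_sleep_bugs = 0;
	raw_lock();
	raw_unlock();
	sleeping_lock_contended();     /* preempt_count is 0 again: allowed */
	return atomic_sleep_bugs;
}
```

The "preempt count: 00000001" line in both reports corresponds to the single raw_lock() level in this model.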
SELinux stuff.
=============================================
[ INFO: possible recursive locking detected ]
---------------------------------------------
init/1 is trying to acquire lock:
(policy_rwlock){--..}, at: [<c01df7ab>] security_genfs_sid+0x1f/0xeb
but task is already holding lock:
(policy_rwlock){--..}, at: [<c01df88c>] security_fs_use+0x15/0xbb
other info that might help us debug this:
3 locks held by init/1:
#0: (sel_mutex){--..}, at: [<c01da736>] sel_write_load+0x1b/0x2c0
#1: (&type->s_umount_key#14){--..}, at: [<c01d6de7>] selinux_complete_init+0x67/0xbe
#2: (policy_rwlock){--..}, at: [<c01df88c>] security_fs_use+0x15/0xbb
stack backtrace:
[<c0104356>] show_trace_log_lvl+0x68/0x193
[<c0104a5a>] show_trace+0x1b/0x20
[<c0104b38>] dump_stack+0x1f/0x24
[<c013b795>] __lock_acquire+0x788/0x9be
[<c013bca4>] lock_acquire+0x55/0x72
[<c02ef13d>] rt_read_lock+0x1f/0x5f
[<c01df7ab>] security_genfs_sid+0x1f/0xeb
[<c01df913>] security_fs_use+0x9c/0xbb
[<c01d667e>] superblock_doinit+0xb5/0x6b5
[<c01d6df5>] selinux_complete_init+0x75/0xbe
[<c01e0ec5>] security_load_policy+0xc9/0x275
[<c01da7c1>] sel_write_load+0xa6/0x2c0
[<c0176411>] vfs_write+0xaf/0x153
[<c0176a71>] sys_write+0x3f/0x66
[<c0103216>] sysenter_past_esp+0x63/0xa1
DWARF2 unwinder stuck at sysenter_past_esp+0x63/0xa1
Leftover inexact backtrace:
[<c0104a5a>] show_trace+0x1b/0x20
[<c0104b38>] dump_stack+0x1f/0x24
[<c013b795>] __lock_acquire+0x788/0x9be
[<c013bca4>] lock_acquire+0x55/0x72
[<c02ef13d>] rt_read_lock+0x1f/0x5f
[<c01df7ab>] security_genfs_sid+0x1f/0xeb
[<c01df913>] security_fs_use+0x9c/0xbb
[<c01d667e>] superblock_doinit+0xb5/0x6b5
[<c01d6df5>] selinux_complete_init+0x75/0xbe
[<c01e0ec5>] security_load_policy+0xc9/0x275
[<c01da7c1>] sel_write_load+0xa6/0x2c0
[<c0176411>] vfs_write+0xaf/0x153
[<c0176a71>] sys_write+0x3f/0x66
[<c0103216>] sysenter_past_esp+0x63/0xa1
---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------
skipping trace printing on CPU#1 != -1
l *security_genfs_sid+0x1f/0xeb
0xc01df78c is in security_genfs_sid
(/usr/src/linux-rt/security/selinux/ss/services.c:1612).
1607 */
1608 int security_genfs_sid(const char *fstype,
1609 char *path,
1610 u16 sclass,
1611 u32 *sid)
1612 {
1613 int len;
1614 struct genfs *genfs;
1615 struct ocontext *c;
1616 int rc = 0, cmp = 0;
l *security_fs_use+0x15/0xbb
0xc01df877 is in security_fs_use
(/usr/src/linux-rt/security/selinux/ss/services.c:1669).
1664 */
1665 int security_fs_use(
1666 const char *fstype,
1667 unsigned int *behavior,
1668 u32 *sid)
1669 {
1670 int rc = 0;
1671 struct ocontext *c;
1672
1673 POLICY_RDLOCK;
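The report says security_fs_use() holds policy_rwlock and then reaches security_genfs_sid(), which takes the same lock again. A minimal sketch of that kind of check, written as a userspace C model; the held-lock list and function names are made up for illustration and this is not how lockdep is implemented. Whether the nesting is actually harmful depends on the lock's semantics, which this post does not settle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

/* Toy per-task list of held lock classes; purely illustrative. */
static const void *held[MAX_HELD];
static size_t nheld;

/* Returns false (and records nothing) if the class is already held. */
static bool acquire_checked(const void *lock_class)
{
	for (size_t i = 0; i < nheld; i++)
		if (held[i] == lock_class)
			return false;      /* "possible recursive locking" */
	assert(nheld < MAX_HELD);
	held[nheld++] = lock_class;
	return true;
}

/* Releases the most recently acquired class. */
static void release(const void *lock_class)
{
	assert(nheld > 0 && held[nheld - 1] == lock_class);
	nheld--;
}

static int policy_rwlock_class;    /* stands in for policy_rwlock */

/* Takes the read lock itself, as security_genfs_sid() does in the trace. */
static bool inner_lookup(void)
{
	if (!acquire_checked(&policy_rwlock_class))
		return false;
	release(&policy_rwlock_class);
	return true;
}

/* Holds the read lock across the call, as security_fs_use() does. */
static bool outer_lookup(void)
{
	bool ok;

	if (!acquire_checked(&policy_rwlock_class))
		return false;
	ok = inner_lookup();           /* second acquisition of the same class */
	release(&policy_rwlock_class);
	return ok;
}
```

Called on its own, inner_lookup() succeeds; called through outer_lookup(), the second acquisition is flagged.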
config & dmesg http://www.stardust.webpages.pl/files/o_bugs/rt/2.6.18-rt1/
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)
 |
K.R. Foley External

Since: May 26, 2006 Posts: 17
|
Posted: Wed Sep 20, 2006 11:00 pm Post subject: Re: 2.6.18-rt2 oops |
|
|
Thomas Gleixner wrote:
> On Wed, 2006-09-20 at 15:33 -0500, K.R. Foley wrote:
>> DOH! The log had two different boots in it. Let's try this again. By
>> the way, you may notice from my screw up that this is pretty much the
>> same oops that I got with 2.6.17-rt*. I have been getting this on all of
>> my SMP systems since we went past 2.6.16.
>
> Which module is modprobed ?
>
> tglx
>
>
>
How can I tell which particular module is being loaded? The last thing I
see on the console before the oops is that it is starting udev. I am
including the rest of the boot log below in hopes that will help.
Suggestions? Something else I can provide?
Linux version 2.6.18-rt2 (aaektkf@krfc3) (gcc version 3.4.4 20050721
(Red Hat 3.4.4-2)) #4 SMP PREEMPT Wed Sep 20 14:53:58 CDT 2006
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000001ff70000 (usable)
BIOS-e820: 000000001ff70000 - 000000001ff77000 (ACPI data)
BIOS-e820: 000000001ff77000 - 000000001ff80000 (ACPI NVS)
BIOS-e820: 000000001ff80000 - 0000000020000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ff800000 - 00000000ffc00000 (reserved)
BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
511MB LOWMEM available.
found SMP MP-table at 000f6b00
DMI present.
ACPI: PM-Timer IO Port: 0x1008
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x06] enabled)
Processor #6 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
Processor #7 15:2 APIC version 20
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x03] address[0xfec80000] gsi_base[24])
IOAPIC[1]: apic_id 3, version 32, address 0xfec80000, GSI 24-47
ACPI: IOAPIC (id[0x04] address[0xfec80100] gsi_base[48])
IOAPIC[2]: apic_id 4, version 32, address 0xfec80100, GSI 48-71
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Enabling APIC mode: Flat. Using 3 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 30000000 (gap: 20000000:dec00000)
Detected 2591.683 MHz processor.
Real-Time Preemption Support (C) 2004-2006 Ingo Molnar
Built 1 zonelists. Total pages: 130928
Kernel command line: ro root=LABEL=/ console=ttyS0,38400 console=tty0
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
WARNING: experimental RCU implementation.
Event source pit configured with caps set: 03
PID hash table entries: 2048 (order: 11, 8192 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 509756k/523712k available (1753k kernel code, 13568k reserved,
1407k data, 196k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 5185.26 BogoMIPS
(lpj=2592634)
Security Framework v1.0.0 initialized
Capability LSM initialized
Mount-cache hash table entries: 512
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
Compat vDSO mapped to ffffe000.
Checking 'hlt' instruction... OK.
Freeing SMP alternatives: 12k freed
ACPI: Core revision 20060707
CPU0: Intel(R) Xeon(TM) CPU 2.60GHz stepping 07
Booting processor 1/1 eip 2000
Initializing CPU#1
Calibrating delay using timer specific routine.. 5182.45 BogoMIPS
(lpj=2591226)
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
CPU1: Intel(R) Xeon(TM) CPU 2.60GHz stepping 07
Booting processor 2/6 eip 2000
Initializing CPU#2
Calibrating delay using timer specific routine.. 5182.51 BogoMIPS
(lpj=2591258)
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 3
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#2.
CPU2: Intel P4/Xeon Extended MCE MSRs (12) available
CPU2: Intel(R) Xeon(TM) CPU 2.60GHz stepping 07
Booting processor 3/7 eip 2000
Initializing CPU#3
Calibrating delay using timer specific routine.. 5182.53 BogoMIPS
(lpj=2591268)
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 3
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#3.
CPU3: Intel P4/Xeon Extended MCE MSRs (12) available
CPU3: Intel(R) Xeon(TM) CPU 2.60GHz stepping 07
Total of 4 processors activated (20732.77 BogoMIPS).
ENABLING IO-APIC IRQs
...TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
lapic max_delta_ns: 1346568721
Event source pit new caps set: 01
Event source lapic configured with caps set: 02
checking TSC synchronization across 4 CPUs: passed.
Event source pit new caps set: 01
Event source lapic configured with caps set: 02
Event source pit new caps set: 01
Event source lapic configured with caps set: 02
Event source pit new caps set: 01
Event source lapic configured with caps set: 02
Brought up 4 CPUs
checking if image is initramfs... it is
Freeing initrd memory: 295k freed
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xfd915, last bus=5
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
* The chipset may have PM-Timer Bug. Due to workarounds for a bug,
* this clock source is slow. If you are sure your timer does not have
* this bug, please use "acpi_pm_good" to disable the workaround
PCI quirk: region 1000-107f claimed by ICH4 ACPI/GPIO/TCO
PCI quirk: region 1180-11bf claimed by ICH4 GPIO
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 10 11 14 15) *5
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 *10 11 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 10 *11 14 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnPACPI: METHOD_NAME__CRS failure for PNP0401
pnp: PnP ACPI: found 12 devices
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a
report
PCI: Bridge: 0000:00:01.0
IO window: disabled.
MEM window: e1000000-e1ffffff
PREFETCH window: ec000000-f7ffffff
PCI: Bridge: 0000:02:1d.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:02:1f.0
IO window: disabled.
MEM window: e2100000-e21fffff
PREFETCH window: disabled.
PCI: Bridge: 0000:00:02.0
IO window: disabled.
MEM window: e2000000-e21fffff
PREFETCH window: disabled.
PCI: Bridge: 0000:00:1e.0
IO window: 2000-2fff
MEM window: e2200000-e22fffff
PREFETCH window: 30000000-300fffff
NET: Registered protocol family 2
IP route cache hash table entries: 4096 (order: 2, 16384 bytes)
TCP established hash table entries: 16384 (order: 7, 655360 bytes)
TCP bind hash table entries: 8192 (order: 6, 294912 bytes)
TCP: Hash tables configured (established 16384 bind 8192)
TCP reno registered
Simple Boot Flag at 0x36 set to 0x1
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [PWRB]
ibm_acpi: ec object not found
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Real Time Clock Driver v1.12ac
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:08: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:09: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH4: IDE controller at PCI slot 0000:00:1f.1
PCI: Enabling device 0000:00:1f.1 (0005 -> 0007)
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 16
ICH4: chipset revision 2
ICH4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0x1460-0x1467, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0x1468-0x146f, BIOS settings: hdc:DMA, hdd:pio
hda: WDC WD800BB-75CAA0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: SONY DVD RW DW-U18A, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: Host Protected Area detected.
current capacity is 156250000 sectors (80000 MB)
native capacity is 156301488 sectors (80026 MB)
hda: Host Protected Area disabled.
hda: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=65535/16/63,
UDMA(100)
hda: cache flushes not supported
hda: hda1 hda2 < hda5 hda6 hda7 >
PNP: PS/2 Controller [PNP0303:KBC,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
mice: PS/2 mouse device common for all mice
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
TCP bic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 8
NET: Registered protocol family 20
Starting balanced_irq
Using IPI Shortcut mode
Freeing unused kernel memory: 196k freed
Time: tsc clocksource has been installed.
input: AT Translated Set 2 keyboard as /class/input/input0
input: ImExPS/2 Generic Explorer Mouse as /class/input/input1
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
BUG: unable to handle kernel paging request at virtual address f3010000
printing eip:
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU: 1
EIP: 0060:[<c0131e02>] Not tainted VLI
EFLAGS: 00010283 (2.6.18-rt2 #4)
EIP is at lookup_symbol+0x11/0x35
eax: 00000001 ebx: e0830e08 ecx: c036ff60 edx: c036dd94
esi: f3010000 edi: e0830e08 ebp: df657e74 esp: df657e68
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process modprobe (pid: 1366, ti=df656000 task=dfc68e90 task.ti=df656000)
Stack: e083c780 00000c00 e0830e08 df657e90 c0131e6f df657ea8 df657ea4
e083c780
00000c00 e0830e08 df657eb8 c0132c21 00000001 00000012 e082d074
00000000
df657ecc e083a434 00000c00 e082d074 df657edc c0133188 e083c780
00000000
Call Trace:
[<c01037a1>] show_stack_log_lvl+0x87/0x8f
[<c010391b>] show_registers+0x12f/0x198
[<c0103b0c>] die+0x114/0x1c6
[<c0111196>] do_page_fault+0x3f2/0x4c8
[<c0103481>] error_code+0x39/0x40
[<c0131e6f>] __find_symbol+0x25/0x2a5
[<c0132c21>] resolve_symbol+0x27/0x5f
[<c0133188>] simplify_symbols+0x83/0xf3
[<c0133e65>] load_module+0x720/0xbb8
[<c013435f>] sys_init_module+0x3f/0x1b5
[<c0102969>] sysenter_past_esp+0x56/0x79
Code: eb 11 8b 75 f0 41 83 c2 28 0f b7 46 30 39 c1 72 c9 31 c0 5a 59 5b
5e 5f 5d c3 55 89 e5 57 56 53 89 c3 39 ca 73 22 8b 72 04 89 df <ac> ae
75 08 84 c0 75 f8 31 c0 eb 04 19 c0 0c 01 85 c0 75 04 89
EIP: [<c0131e02>] lookup_symbol+0x11/0x35 SS:ESP 0068:df657e68
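For readers following the oops: the fault is in lookup_symbol(), reached from __find_symbol() during module loading. Below is a userspace sketch in C of what a linear symbol-table lookup of this kind looks like. It is written from general knowledge, not copied from this kernel tree; struct ksym, lookup() and demo_table are illustrative names. The faulting address f3010000 equals esi in the register dump, which would be consistent with a bad name pointer being read during the string compare, but that reading is an assumption and is not confirmed anywhere in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for an exported-symbol table entry (illustrative). */
struct ksym {
	unsigned long value;
	const char *name;
};

/* Linear scan of [start, stop); returns the matching entry or NULL. */
static const struct ksym *lookup(const char *name,
				 const struct ksym *start,
				 const struct ksym *stop)
{
	for (const struct ksym *ks = start; ks < stop; ks++)
		if (strcmp(ks->name, name) == 0)   /* dereferences ks->name */
			return ks;
	return NULL;
}

/* Small demo table so the scan can be exercised. */
static const struct ksym demo_table[] = {
	{ 0x1000, "alpha" },
	{ 0x2000, "beta" },
};
#define DEMO_COUNT (sizeof(demo_table) / sizeof(demo_table[0]))
```

A scan like this trusts both the table bounds and every name pointer in it, so a corrupt entry or a wrong stop bound faults inside the compare rather than returning NULL.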
--
kr
 |
Michal Piotrowski External

Since: May 15, 2006 Posts: 326
|
Posted: Thu Sep 21, 2006 12:10 am Post subject: Re: 2.6.18-rt1 |
|
|
On 20/09/06, Ingo Molnar <mingo@elte.hu> wrote:
> I'm pleased to announce the 2.6.18-rt1 tree, which can be downloaded
> from the usual place:
>
sudo /etc/init.d/iptables stop
ip_conntrack issue?
BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/20112
caller is drain_array+0x19/0xe6
[<c0104356>] show_trace_log_lvl+0x68/0x193
[<c0104a5a>] show_trace+0x1b/0x20
[<c0104b38>] dump_stack+0x1f/0x24
[<c01f7c3f>] debug_smp_processor_id+0x7f/0x90
[<c01711ea>] drain_array+0x19/0xe6
[<c0171442>] __cache_shrink+0x3b/0x7c
[<c0172e08>] kmem_cache_destroy+0x80/0x145
[<fd9423a8>] ip_conntrack_cleanup+0x88/0xbc [ip_conntrack]
[<fd94540c>] ip_conntrack_standalone_fini+0x5c/0x8b [ip_conntrack]
[<c0143e6f>] sys_delete_module+0x195/0x1be
[<c0103216>] sysenter_past_esp+0x63/0xa1
DWARF2 unwinder stuck at sysenter_past_esp+0x63/0xa1
Leftover inexact backtrace:
[<c0104a5a>] show_trace+0x1b/0x20
[<c0104b38>] dump_stack+0x1f/0x24
[<c01f7c3f>] debug_smp_processor_id+0x7f/0x90
[<c01711ea>] drain_array+0x19/0xe6
[<c0171442>] __cache_shrink+0x3b/0x7c
[<c0172e08>] kmem_cache_destroy+0x80/0x145
[<fd9423a8>] ip_conntrack_cleanup+0x88/0xbc [ip_conntrack]
[<fd94540c>] ip_conntrack_standalone_fini+0x5c/0x8b [ip_conntrack]
[<c0143e6f>] sys_delete_module+0x195/0x1be
[<c0103216>] sysenter_past_esp+0x63/0xa1
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
... [<c01f7bfe>] .... debug_smp_processor_id+0x3e/0x90
......[<c01711ea>] .. ( <= drain_array+0x19/0xe6)
skipping trace printing on CPU#0 != -1
BUG: modprobe:20112 task might have lost a preemption check!
[<c0104356>] show_trace_log_lvl+0x68/0x193
[<c0104a5a>] show_trace+0x1b/0x20
[<c0104b38>] dump_stack+0x1f/0x24
[<c011cd8a>] preempt_enable_no_resched+0x48/0x4d
[<c01f7c47>] debug_smp_processor_id+0x87/0x90
[<c01711ea>] drain_array+0x19/0xe6
[<c0171442>] __cache_shrink+0x3b/0x7c
[<c0172e08>] kmem_cache_destroy+0x80/0x145
[<fd9423a8>] ip_conntrack_cleanup+0x88/0xbc [ip_conntrack]
[<fd94540c>] ip_conntrack_standalone_fini+0x5c/0x8b [ip_conntrack]
[<c0143e6f>] sys_delete_module+0x195/0x1be
[<c0103216>] sysenter_past_esp+0x63/0xa1
DWARF2 unwinder stuck at sysenter_past_esp+0x63/0xa1
Leftover inexact backtrace:
[<c0104a5a>] show_trace+0x1b/0x20
[<c0104b38>] dump_stack+0x1f/0x24
[<c011cd8a>] preempt_enable_no_resched+0x48/0x4d
[<c01f7c47>] debug_smp_processor_id+0x87/0x90
[<c01711ea>] drain_array+0x19/0xe6
[<c0171442>] __cache_shrink+0x3b/0x7c
[<c0172e08>] kmem_cache_destroy+0x80/0x145
[<fd9423a8>] ip_conntrack_cleanup+0x88/0xbc [ip_conntrack]
[<fd94540c>] ip_conntrack_standalone_fini+0x5c/0x8b [ip_conntrack]
[<c0143e6f>] sys_delete_module+0x195/0x1be
[<c0103216>] sysenter_past_esp+0x63/0xa1
---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------
l *0xc01f7bfe
0xc01f7bfe is in debug_smp_processor_id
(/usr/src/linux-rt/lib/smp_processor_id.c:42).
37 /*
38 * Avoid recursion:
39 */
40 preempt_disable();
41
42 if (!printk_ratelimit())
43 goto out_enable;
44
45 printk(KERN_ERR "BUG: using smp_processor_id() in preemptible [%08x] code: %s/%d\n", preempt_count()-1, current->comm, current->pid);
46 print_symbol("caller is %s\n", (long)__builtin_return_address(0));
l *0xc01711ea
0xc01711ea is in drain_array (/usr/src/linux-rt/mm/slab.c:3852).
3847 struct array_cache *ac, int force, int node)
3848 {
3849 int this_cpu = smp_processor_id();
3850 int tofree;
3851
3852 if (!ac || !ac->avail)
3853 return;
3854 if (ac->touched && !force) {
3855 ac->touched = 0;
3856 } else {
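The listing shows drain_array() calling smp_processor_id() on entry, and the warning fires because the caller is preemptible at that point. A toy C model of the check and of the usual pin/unpin bracket (the kernel's get_cpu()/put_cpu() pair plays that role). The model looks only at a preempt counter, which is a simplification, and all names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model; names are illustrative, not the kernel's implementation. */
static int preempt_count;
static int current_cpu = 1;
static int warnings;

/* The CPU number is only stable while preemption is disabled. */
static int checked_cpu_id(void)
{
	if (preempt_count == 0)
		warnings++;    /* "using smp_processor_id() in preemptible code" */
	return current_cpu;
}

/* get_cpu()/put_cpu()-style bracket: pin, read the id, work, unpin. */
static int pinned_cpu_begin(void) { preempt_count++; return checked_cpu_id(); }
static void pinned_cpu_end(void)  { assert(preempt_count > 0); preempt_count--; }

/* Models the reported caller: reads the CPU id without pinning first. */
static int drain_unpinned(void)
{
	warnings = 0;
	(void)checked_cpu_id();
	return warnings;
}

/* Same read, but inside the pin/unpin bracket. */
static int drain_pinned(void)
{
	warnings = 0;
	(void)pinned_cpu_begin();
	/* per-CPU work would go here */
	pinned_cpu_end();
	return warnings;
}
```

This only shows why the check trips; it says nothing about which fix is right for the slab code on -rt.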
http://www.stardust.webpages.pl/files/o_bugs/rt/2.6.18-rt1/rt-config
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)
 |
Michal Piotrowski External

Since: May 15, 2006 Posts: 326
|
Posted: Thu Sep 21, 2006 12:30 am Post subject: Re: 2.6.18-rt1 |
|
|
On 20/09/06, Ingo Molnar <mingo@elte.hu> wrote:
> I'm pleased to announce the 2.6.18-rt1 tree, which can be downloaded
> from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
Hibernation doesn't work for me.
echo shutdown > /sys/power/disk; echo disk > /sys/power/state
Freezing cpus...
Breaking affinity for irq 14
Breaking affinity for irq 15
Breaking affinity for irq 19
Breaking affinity for irq 21
Any ideas why?
http://www.stardust.webpages.pl/files/o_bugs/rt/2.6.18-rt1/rt-config
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)
 |
Ingo Molnar External

Since: May 15, 2006 Posts: 3111
|
Posted: Thu Sep 21, 2006 9:10 am Post subject: Re: [PATCH] move put_task_struct() reaping into a thread [Re: 2.6.18-rt1] |
|
|
* Bill Huey <billh@gnuppy.monkey.org> wrote:
> On Wed, Sep 20, 2006 at 04:19:07PM +0200, Ingo Molnar wrote:
> > I'm pleased to announce the 2.6.18-rt1 tree, which can be downloaded
> > from the usual place:
> ...
> > as usual, bugreports, fixes and suggestions are welcome,
>
> Speaking of which...
>
> This patch moves put_task_struct() reaping into a thread instead of an
> RCU callback function [...]
had some time to think about it since yesterday: RCU reaping is done in
softirqs (check out the softirq-rcu threads on your -rt box), that's why
i removed the delayed-task-drop code to begin with. Now i don't doubt
that you saw crashes under 2.6.17 - but did you manage to figure out
what the reason is for those crashes, and do those reasons really
necessitate the pushing of task-reapdown into yet another set of kernel
threads?
Ingo
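The mechanism being debated is deferred freeing: dropping the last reference queues a callback, and a separate context runs it later. A single-threaded userspace sketch in C of that shape follows. It models neither RCU grace periods nor real threads, and struct deferred, task_put() and reaper_run_once() are hypothetical names, not the kernel's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model of deferred freeing; every name here is illustrative. */
struct deferred {
	struct deferred *next;
	void (*func)(struct deferred *);
};

static struct deferred *pending;   /* callbacks waiting for the reaper */
static int reaped;

/* Queue a callback instead of running it in the caller's context. */
static void defer_call(struct deferred *d, void (*func)(struct deferred *))
{
	d->func = func;
	d->next = pending;
	pending = d;
}

struct task {
	int usage;                 /* reference count */
	struct deferred rcu;       /* embedded callback node */
};

static void task_free_cb(struct deferred *d)
{
	/* recover the enclosing task from its embedded member */
	struct task *t = (struct task *)((char *)d - offsetof(struct task, rcu));

	free(t);
	reaped++;
}

/* Dropping the last reference only queues the free; it never frees inline. */
static void task_put(struct task *t)
{
	assert(t->usage > 0);
	if (--t->usage == 0)
		defer_call(&t->rcu, task_free_cb);
}

/* Body of the one reaper context (the role softirq-rcu plays on -rt). */
static int reaper_run_once(void)
{
	int n = 0;

	while (pending) {
		struct deferred *d = pending;

		pending = d->next;
		d->func(d);
		n++;
	}
	return n;
}

static struct task *task_new(void)
{
	struct task *t = calloc(1, sizeof(*t));

	assert(t != NULL);
	t->usage = 1;
	return t;
}
```

In this model the question in the thread becomes whether reaper_run_once() should live in the existing callback thread or in a new dedicated one; the queueing side is the same either way.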
 |
Bill Huey External

Since: Jul 24, 2006 Posts: 84
|
Posted: Thu Sep 21, 2006 9:20 am Post subject: Re: [PATCH] move put_task_struct() reaping into a thread [Re: 2.6.18-rt1] |
|
|
On Thu, Sep 21, 2006 at 08:54:02AM +0200, Ingo Molnar wrote:
> * Bill Huey <billh@gnuppy.monkey.org> wrote:
>
> > On Wed, Sep 20, 2006 at 04:19:07PM +0200, Ingo Molnar wrote:
> > > I'm pleased to announce the 2.6.18-rt1 tree, which can be downloaded
> > > from the usual place:
> > ...
> > > as usual, bugreports, fixes and suggestions are welcome,
> >
> > Speaking of which...
> >
> > This patch moves put_task_struct() reaping into a thread instead of an
> > RCU callback function [...]
>
> had some time to think about it since yesterday: RCU reaping is done in
> softirqs (check out the softirq-rcu threads on your -rt box), that's why
> i removed the delayed-task-drop code to begin with. Now i don't doubt
It's correct from the standpoint of it being reaped in another thread,
so it fixed those crashes. But I pushed it down into another thread at the
request of Esben, following his private discussion with Paul McKenney:
Esben's summary suggested that call_rcu() was somehow less than ideal
for that.
> that you saw crashes under 2.6.17 - but did you manage to figure out
> what the reason is for those crashes, and do those reasons really
> necessiate the pushing of task-reapdown into yet another set of kernel
> threads?
Unfortunately no. I even used Robert's .config on my machine. I added a
disk controller and networking device driver just to boot into his
configuration and I still couldn't replicate any of his kjournald problems
at all. If I had his hardware I'd have a better way of replicating those
problems and pounding them out.
bill
Ingo Molnar External
Since: May 15, 2006 Posts: 3111
Posted: Thu Sep 21, 2006 9:30 am
Post subject: Re: [PATCH] move put_task_struct() reaping into a thread [Re: 2.6.18-rt1]
* Bill Huey <billh DeleteThis @gnuppy.monkey.org> wrote:
> > > This patch moves put_task_struct() reaping into a thread instead
> > > of an RCU callback function [...]
> >
> > had some time to think about it since yesterday: RCU reaping is done
> > in softirqs (check out the softirq-rcu threads on your -rt box),
> > that's why i removed the delayed-task-drop code to begin with. Now i
> > dont doubt
>
> It's correct from the standpoint of it being reaped in another thread,
> so it fixed those crashes. But I pushed it down into another thread at
> the request of Esben and his private discussion with Paul McKenney,
> since a summary from Esben felt that call_rcu() was somehow less than
> ideal to do that.
but it _is_ already being reaped in another thread: softirq-rcu.
Splitting that up any further will only fragment the context-switching
and increase cache footprint - it won't (or rather, shouldn't) have any
functional effect. (As a sidenote, i'm considering the unification of
all 'same default priority' softirq threads into a single thread per
CPU, to further reduce this cost of 'spreadout'.)
> > that you saw crashes under 2.6.17 - but did you manage to figure out
> > what the reason is for those crashes, and do those reasons really
> > necessiate the pushing of task-reapdown into yet another set of
> > kernel threads?
>
> Unfortunately no. I even used Robert's .config on my machine. I added
> a disk controller and networking device driver just to boot into his
> configuration and I still couldn't replicated any of his kjournald
> problems at all. If I had his hardware I'd have a better way of
> replicating those problems and pound it out.
ok, then i guess what we have left is to wait and see whether it still
triggers with the current 2.6.18-rt codebase - maybe it triggers for
someone in a scenario that is easier to debug.
Ingo
Bill Huey External
Since: Jul 24, 2006 Posts: 84
Posted: Thu Sep 21, 2006 9:30 am
Post subject: Re: [PATCH] move put_task_struct() reaping into a thread [Re: 2.6.18-rt1]
On Thu, Sep 21, 2006 at 12:18:40AM -0700, Bill Huey wrote:
> On Thu, Sep 21, 2006 at 08:54:02AM +0200, Ingo Molnar wrote:
> > that you saw crashes under 2.6.17 - but did you manage to figure out
> > what the reason is for those crashes, and do those reasons really
> > necessiate the pushing of task-reapdown into yet another set of kernel
> > threads?
>
> Unfortunately no. I even used Robert's .config on my machine. I added a
> disk controller and networking device driver just to boot into his
> configuration and I still couldn't replicated any of his kjournald problems
> at all. If I had his hardware I'd have a better way of replicating those
> problems and pound it out.
Robert's stack traces looked completely wrong as well, which is why I gave up.
Symbols showing up in those stack traces should have been completely compiled
out.
Also, triggering a panic() at the beginning of the rt mutex acquire was
very useful since it made "in_atomic()" violations an explicit error stopping
the machine. Stack traces started to get really crazy in this preemptive
kernel with all sorts of things running unlike the non-preemptive kernel and
it was time consuming to figure out the real stuff from the noise in the
stack trace.
It made the stack traces smaller and more immediately local to the problem
logic. Then I discovered panic() didn't work correctly in -rt so I fixed that
as well. There were a lot of little breakdowns in 2.6.17-rt...
bill
Ingo Molnar External
Since: May 15, 2006 Posts: 3111
Posted: Thu Sep 21, 2006 9:40 am
Post subject: Re: [PATCH] move put_task_struct() reaping into a thread [Re: 2.6.18-rt1]
* Bill Huey <billh DeleteThis @gnuppy.monkey.org> wrote:
> > but it _is_ already being reaped in another thread: softirq-rcu.
> > Splitting that up any further will only fragment the
> > context-switching and increases cache footprint - it wont (or
> > rather, shouldnt) have any functional effect. (As a sidenote, i'm
> > considering the unification of all 'same default priority' softirq
> > threads into a single thread per CPU, to further reduce this cost of
> > 'spreadout'.)
>
> I overloaded another reaping thread that was doing largely similar
> functionality in that it was also reaping, so I don't think it's that
> bad. I did it from a cleanliness point of view with the code tree.
> It's the "desched_thread" in fork.c that I'm using. It seems to be the
> right thing to do. I'm sure Esben will follow up on this.
the reason why i added desched_thread was not because it's "more right"
to do this from a separate context, but simply because the resource
freed by it is not being freed via RCU by the upstream kernel. If that
resource (mm_struct) were freed by RCU we'd have its rt-friendly
reapdown "for free" and no desched_thread would be needed at all.
Ingo
Ingo Molnar External
Since: May 15, 2006 Posts: 3111
Posted: Thu Sep 21, 2006 9:40 am
Post subject: Re: [PATCH] move put_task_struct() reaping into a thread [Re: 2.6.18-rt1]
* Bill Huey <billh DeleteThis @gnuppy.monkey.org> wrote:
> Also, triggering a panic() at the beginning of the rt mutex acquire
> was very useful since it made "in_atomic()" violations an explicit
> error stopping the machine. Stack traces started to get really crazy
> in this preemptive kernel with all sorts of things running unlike the
> non-preemptive kernel and it was time consuming to figure out the real
> stuff from the noise in the stack trace.
well you should absolutely have serial console if you effectively want
to hack the Linux kernel. And in the serial console log you should
search for stacktraces top-down, and concentrate on the first one - any
subsequent one might be collateral damage of the first one.
Ingo
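The "look at the first stack trace" advice can be applied mechanically to a captured serial console log. The following is only a minimal sketch, not a tool anyone in this thread used: the marker strings in `TRACE_START` and the rule that a blank line ends a dump are assumptions, and the name `first_trace` is hypothetical.

```python
import re

# Assumption: a dump begins at a line containing one of these markers.
TRACE_START = re.compile(r"BUG:|Oops|Call Trace:")


def first_trace(log_text):
    """Return the lines of the first stack dump in a console log, or []."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if TRACE_START.search(line):
            block = []
            for follow in lines[i:]:
                if not follow.strip():
                    break  # assumption: a blank line ends the dump
                block.append(follow)
            return block
    return []
```

Feeding it the whole log and reading only the returned block keeps later dumps, which may be collateral damage, out of view until the first one is fixed. It does not untangle dumps that several CPUs interleaved on the console.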
Bill Huey External
Since: Jul 24, 2006 Posts: 84
Posted: Thu Sep 21, 2006 9:40 am
Post subject: Re: [PATCH] move put_task_struct() reaping into a thread [Re: 2.6.18-rt1]
On Thu, Sep 21, 2006 at 09:16:24AM +0200, Ingo Molnar wrote:
> * Bill Huey <billh RemoveThis @gnuppy.monkey.org> wrote:
> > It's correct from the standpoint of it being reaped in another thread,
> > so it fixed those crashes. But I pushed it down into another thread at
> > the request of Esben and his private discussion with Paul McKenney,
> > since a summary from Esben felt that call_rcu() was somehow less than
> > ideal to do that.
>
> but it _is_ already being reaped in another thread: softirq-rcu.
> Splitting that up any further will only fragment the context-switching
> and increases cache footprint - it wont (or rather, shouldnt) have any
> functional effect. (As a sidenote, i'm considering the unification of
> all 'same default priority' softirq threads into a single thread per
> CPU, to further reduce this cost of 'spreadout'.)
I overloaded another reaping thread that was doing largely similar
functionality in that it was also reaping, so I don't think it's that bad.
I did it from a cleanliness point of view with the code tree. It's the
"desched_thread" in fork.c that I'm using. It seems to be the right
thing to do. I'm sure Esben will follow up on this.
> > > that you saw crashes under 2.6.17 - but did you manage to figure out
> > > what the reason is for those crashes, and do those reasons really
> > > necessiate the pushing of task-reapdown into yet another set of
> > > kernel threads?
> >
> > Unfortunately no. I even used Robert's .config on my machine. I added
> > a disk controller and networking device driver just to boot into his
> > configuration and I still couldn't replicated any of his kjournald
> > problems at all. If I had his hardware I'd have a better way of
> > replicating those problems and pound it out.
>
> ok, then i guess what we have left is to wait and see whether it still
> triggers with the current 2.6.18-rt codebase - maybe it triggers for
> someone in a scenario that is easier to debug.
bill
Bill Huey External
Since: Jul 24, 2006 Posts: 84
Posted: Thu Sep 21, 2006 9:40 am
Post subject: Re: [PATCH] move put_task_struct() reaping into a thread [Re: 2.6.18-rt1]
On Thu, Sep 21, 2006 at 09:22:16AM +0200, Ingo Molnar wrote:
> * Bill Huey <billh.DeleteThis@gnuppy.monkey.org> wrote:
>
> > Also, triggering a panic() at the beginning of the rt mutex acquire
> > was very useful since it made "in_atomic()" violations an explicit
> > error stopping the machine. Stack traces started to get really crazy
> > in this preemptive kernel with all sorts of things running unlike the
> > non-preemptive kernel and it was time consuming to figure out the real
> > stuff from the noise in the stack trace.
>
> well you should absolutely have serial console if you effectively want
> to hack the Linux kernel. And in the serial console log you should
> search for stacktraces top-down, and concentrate on the first one - any
> subsequent one might be collateral damage of the first one.
Of course I did that. I'm not that stupid. The stack traces, even with
your above suggestions, were too many and I had to break it down a bug at
a time, a stack trace at a time, since I realize that earlier problems could
clash and trigger other unrelated problems.
It was even problematic with the serial console on, which is why I did
that. Maybe it was an artifact of having both the serial console and video
consoles on?
bill
Ingo Molnar External
Since: May 15, 2006 Posts: 3111
Posted: Thu Sep 21, 2006 9:50 am
Post subject: Re: [PATCH] move put_task_struct() reaping into a thread [Re: 2.6.18-rt1]
* Bill Huey <billh.DeleteThis@gnuppy.monkey.org> wrote:
> On Thu, Sep 21, 2006 at 09:22:16AM +0200, Ingo Molnar wrote:
> > * Bill Huey <billh.DeleteThis@gnuppy.monkey.org> wrote:
> >
> > > Also, triggering a panic() at the beginning of the rt mutex acquire
> > > was very useful since it made "in_atomic()" violations an explicit
> > > error stopping the machine. Stack traces started to get really crazy
> > > in this preemptive kernel with all sorts of things running unlike the
> > > non-preemptive kernel and it was time consuming to figure out the real
> > > stuff from the noise in the stack trace.
> >
> > well you should absolutely have serial console if you effectively want
> > to hack the Linux kernel. And in the serial console log you should
> > search for stacktraces top-down, and concentrate on the first one - any
> > subsequent one might be collateral damage of the first one.
>
> Of course I did that. I'm not that stupid. The stack traces, even
> with your above suggestions were too many and I had to break it down a
> bug at a time, stack trace at a time, since I realize problems earlier
> could clash and trigger other unrelated problems.
>
> It was even problematic with the serial console on which is why I did
> that. Maybe it was an artifact of having both the serial console and
> video consoles on ?
perhaps the real problem was that you got 'intermixed' stackdumps from
multiple CPUs crashing at once? Or was it simply the myriads of
stackdumps? The myriads effect is easy to solve: only look at the first
one, and fix them one by one.
Ingo
Bill Huey External
Since: Jul 24, 2006 Posts: 84
Posted: Thu Sep 21, 2006 9:50 am
Post subject: Re: [PATCH] move put_task_struct() reaping into a thread [Re: 2.6.18-rt1]
On Thu, Sep 21, 2006 at 09:29:08AM +0200, Ingo Molnar wrote:
> * Bill Huey <billh.DeleteThis@gnuppy.monkey.org> wrote:
> > I overloaded another reaping thread that was doing largely similar
> > functionality in that it was also reaping, so I don't think it's that
> > bad. I did it from a cleanliness point of view with the code tree.
> > It's the "desched_thread" in fork.c that I'm using. It seems to be the
> > right thing to do. I'm sure Esben will follow up on this.
>
> the reason why i added desched_thread was not because it's "more right"
> to do this from a separate context, but simply because the resource
I only did that because I saw it there and I assumed it was the correct
thing to use, and that's why I used it.
> freed by it is not being freed via RCU by the upstream kernel. If that
> resource (mm_struct) were freed by RCU we'd have its rt-friendly
> reapdown "for free" and no desched_thread would be needed at all.
Well, it's difficult to say. I can't say which is the best method. If the
upstream kernel used an RCU function in task allocation or task struct reading
in the first place then call_rcu() would be a clear choice. However, I didn't
see it used in that way (I could be wrong) so I used the next closest thing that
seems reasonable, which is the thread desched_thread(). I use it to avoid
overloading the semantics of call_rcu() to be anything other than a pure RCU
callback. I suggest talking to Esben and Paul about this to get their view on
the matter.
Either method, call_rcu() or desched_thread(), does the trick outside of the
scheduler path and fixes the problem. It's your choice.
bill
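Both options discussed here share one pattern: the context that drops the last reference only queues the object, and a separate thread does the actual freeing. The following is a toy userspace model of that pattern, not the patch and not kernel code. The class `Reaper` and its methods are hypothetical names, and appending to a list stands in for the real free.

```python
import queue
import threading


class Reaper:
    """Toy model: objects are freed in a dedicated thread, not by the caller."""

    def __init__(self):
        self._pending = queue.Queue()
        self.freed = []  # stands in for memory actually being released
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def defer_put(self, task):
        # Called from the path that must not do the freeing itself.
        self._pending.put(task)

    def _run(self):
        while True:
            task = self._pending.get()
            if task is None:  # sentinel: shut the reaper down
                return
            self.freed.append(task)

    def stop(self):
        self._pending.put(None)
        self._thread.join()
```

In this model the choice between the two methods is only a question of which thread runs the drain loop, which matches the point that the choice should have no functional effect.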
Bill Huey External
Since: Jul 24, 2006 Posts: 84
Posted: Thu Sep 21, 2006 10:00 am
Post subject: Re: [PATCH] move put_task_struct() reaping into a thread [Re: 2.6.18-rt1]
On Thu, Sep 21, 2006 at 09:31:30AM +0200, Ingo Molnar wrote:
> * Bill Huey <billh.RemoveThis@gnuppy.monkey.org> wrote:
> > It was even problematic with the serial console on which is why I did
> > that. Maybe it was an artifact of having both the serial console and
> > video consoles on ?
>
> perhaps the real problem was that you got 'intermixed' stackdumps from
> multiple CPUs crashing at once? Or was it simply the myriads of
> stackdumps? The myriads effect is easy to solve: only look at the first
> one, and fix them one by one
I don't think I have to tell you that things got "really weird" for a
while which is why I took the route of most severity and elected to use
extreme debugging methods.
I mean, some of those stack traces kept triggering a jumble of schedule()
calls, etc... I decided to hack the heads off of them one at a time and
stop the kernel immediately after one of those bugs. The immediate panic()
is what caught the tbl raw_spinlock issue and therefore my lock reversion
after auditing that portion of the lock graph.
bill
Deepak Saxena External
Since: May 16, 2006 Posts: 29
Posted: Thu Sep 21, 2006 10:10 am
Post subject: Re: 2.6.18-rt1
On Sep 20 2006, at 21:46, Ingo Molnar was caught saying:
>
> * Gene Heskett <gene.heskett.DeleteThis@verizon.net> wrote:
>
> > That looks like the chorus of the song I saw when it crashed on boot,
> > pretty darned close to identical.
>
> ok, i've uploaded -rt3:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this should have this one fixed.
I am seeing an intermittent lock up on the ARM Versatile board during the
ALSA driver init that only shows up with (PREEMPT_RT & !HIGH_RES_TIMERS
& ARM_EABI) enabled. With RT, the kernel boots every time when HRT and EABI
are both enabled or both disabled, and the same goes for !RT & !HRT & EABI.
I get no oops, just a complete lock up with no console output.
In summary:
PREEMPT   HRT   EABI   BOOTS
------------------------------------------
2.6.18-rt3
------------------------------------------
RT        Y     Y      Y
RT        N     Y      Intermittent
RT        N     N      Y
NONE      Y     Y      Y
NONE      N     Y      Y
NONE      N     N      Y
------------------------------------------
2.6.18-vanilla
------------------------------------------
N/A Y Y
------------------------------------------
I need to go pinpoint the exact point where it is locking up during
the ALSA driver init (calls to udelay() seem suspect to me). It is
very possible that this is a toolchain issue, but I want to see if
any other ARM folks are seeing issues with EABI & !HRT.
~Deepak
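The 2.6.18-rt3 rows of the table above can be written down as data to check which option combinations fail and which were never tried. This is only a sketch: the values are copied from the table, while the names `RESULTS`, `failing` and `untested` are hypothetical.

```python
from itertools import product

# (PREEMPT, HRT enabled, EABI enabled) -> BOOTS, as reported for 2.6.18-rt3
RESULTS = {
    ("RT", True, True): "Y",
    ("RT", False, True): "Intermittent",
    ("RT", False, False): "Y",
    ("NONE", True, True): "Y",
    ("NONE", False, True): "Y",
    ("NONE", False, False): "Y",
}


def failing(results):
    """Configurations that did not boot reliably."""
    return [cfg for cfg, boots in results.items() if boots != "Y"]


def untested(results):
    """Option combinations with no reported result."""
    every = product(("RT", "NONE"), (True, False), (True, False))
    return [cfg for cfg in every if cfg not in results]
```

Run on this data, `failing` returns only the RT & !HRT & EABI case, and `untested` returns the two HRT & !EABI combinations, which the table does not cover.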
--
Deepak Saxena - dsaxena.DeleteThis@plexity.net - http://www.plexity.net
"An open heart has no possessions, only experiences" - Matt Bibbeau
Ingo Molnar External
Since: May 15, 2006 Posts: 3111
Posted: Thu Sep 21, 2006 10:10 am
Post subject: Re: [PATCH] move put_task_struct() reaping into a thread [Re: 2.6.18-rt1]
* Bill Huey <billh.RemoveThis@gnuppy.monkey.org> wrote:
> [...] If the upstream kernel used RCU function in a task allocation or
> task struct reading in the first place then call_rcu() would be a
> clear choice. However, I didn't see it used in that way (I could be
> wrong) [...]
it was RCU-ified briefly but then it was further improved to direct
freeing, because upstream _can_ free it directly.
Ingo