Help!

[Announce] 2.6.29-rc4-rt1

 
  

Goto page 1, 2, 3 ... 9, 10, 11
Post new topic   General Reply to Topic (not reply to a specific post)    Forums Home -> Kernel RSS
Next:  [PATCH 2.6.30] RDMA/cxgb3: remove modulo math fro..  
Author Message
Thomas Gleixner
External


Since: Feb 11, 2009
Posts: 1



PostPosted: Wed Feb 11, 2009 7:10 pm    Post subject: [Announce] 2.6.29-rc4-rt1
Archived from groups: linux>kernel (more info?)

After a 1.5 years sabbatical from preempt-rt we are pleased to
announce a refactored preempt-rt patch against linux-2.6.29-rc4.

The patch is working on x86 (32 and 64bit) but we have not yet updated
ARM, PPC and MIPS (work in progress).

We also dropped some experimental features of the base preempt-rt
queue 2.6.26.8-rt15 simply because we wanted to survive the forward
port over 3 kernel releases with the least amount of surprises. These
features (e.g. multiple reader PI locks) are not essential for the
preempt-rt functionality and need some serious overhaul anyway.

The interested -rt observer might have noticed that we based our work
on the 2.6.26.8-rt15 patch queue and did not pick the git-rt tree
which is based on 2.6.28. The reason for this is that we wanted to pick
the most stable patch queue and the git-rt tree has a lot of rewritten
new code. Our work is not making the work which was done over the last
months in the git-rt tree obsolete, quite the contrary: we want to
provide a stable yet latest-kernel based foundation and integrate those
changes gradually, as they become ready.

The further plan for the new -rt series is to merge it fully into git
and integrate it into the -tip git tree so it gets the same treatment
as all of our -tip based work: fully automated compile and boot
testing. Furthermore an automated multi architecture -rt performance
regression test based on the same infrastructure is currently being
built.

The integration into the -tip tree also allows us to seperate out parts
of -rt which are ready for mainline more easily and integrate them
with our usual propagation to mainline.

The structure of the patches is likely to change over the next days
when we tackle the git integration, but we appreciate your feedback in
the form of comments, bugreports and patches.

Enough said. You can find the new patches at the following location:

http://www.kernel.org/pub/linux/kernel/projects/rt/

Information on the RT patch can be found at:

http://rt.wiki.kernel.org/index.php/Main_Page

to build the 2.6.29-rc4-rt1 tree, the following patches should be
applied:

http://www.kernel.org/pub/linux/kernel/v2.6/testing/linux-2.6.29-rc4.tar.bz2
http://www.kernel.org/pub/linux/kernel/projects/rt/patch-2.6.29-rc4-rt1.bz2

The broken out patches are also available at the same download
location.

Enjoy !

Thomas, Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.RemoveThis@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Frederic Weisbecker
External


Since: Jan 17, 2009
Posts: 267



PostPosted: Wed Feb 11, 2009 10:10 pm    Post subject: Re: [Announce] 2.6.29-rc4-rt1 [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Thu, Feb 12, 2009 at 01:50:32AM +0100, Frederic Weisbecker wrote:
> On Wed, Feb 11, 2009 at 11:43:44PM +0100, Thomas Gleixner wrote:
> > After a 1.5 years sabbatical from preempt-rt we are pleased to
> > announce a refactored preempt-rt patch against linux-2.6.29-rc4.
> >
> > The patch is working on x86 (32 and 64bit) but we have not yet updated
> > ARM, PPC and MIPS (work in progress).
> >
> > We also dropped some experimental features of the base preempt-rt
> > queue 2.6.26.8-rt15 simply because we wanted to survive the forward
> > port over 3 kernel releases with the least amount of surprises. These
> > features (e.g. multiple reader PI locks) are not essential for the
> > preempt-rt functionality and need some serious overhaul anyway.
> >
> > The interested -rt observer might have noticed that we based our work
> > on the 2.6.26.8-rt15 patch queue and did not pick the git-rt tree
> > which is based on 2.6.28. The reason for this is that we wanted to pick
> > the most stable patch queue and the git-rt tree has a lot of rewritten
> > new code. Our work is not making the work which was done over the last
> > months in the git-rt tree obsolete, quite the contrary: we want to
> > provide a stable yet latest-kernel based foundation and integrate those
> > changes gradually, as they become ready.
> >
> > The further plan for the new -rt series is to merge it fully into git
> > and integrate it into the -tip git tree so it gets the same treatment
> > as all of our -tip based work: fully automated compile and boot
> > testing. Furthermore an automated multi architecture -rt performance
> > regression test based on the same infrastructure is currently being
> > built.
> >
> > The integration into the -tip tree also allows us to seperate out parts
> > of -rt which are ready for mainline more easily and integrate them
> > with our usual propagation to mainline.
> >
> > The structure of the patches is likely to change over the next days
> > when we tackle the git integration, but we appreciate your feedback in
> > the form of comments, bugreports and patches.
> >
>
>
> Hi!
>
> I get some sleep while atomic warnings.
> I've put the log and my config in attachment.
>
>

Note, it's a wicked bug: I can't reproduce it anymore.
I would have been glad to give you an irqsoff trace but I can't Smile

Oh yes I have two other warnings, for the second one, I'm not sure
this is really only present in -rt.

The first one, a lockdep warning:

[ 2.975320] ---------------------------------
[ 2.975320] inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
[ 2.975320] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[ 2.975320] (per_cpu__lock_slab_irq_locks_locked#2){+-..}, at: [<ffffffff802f6823>] kfree+0x43/0xc0
[ 2.975320] {hardirq-on-W} state was registered at:
[ 2.975320] [<ffffffff802822c5>] __lock_acquire+0x6f5/0x1b20
[ 2.975320] [<ffffffff8028378f>] lock_acquire+0x9f/0xe0
[ 2.975320] [<ffffffff8077c7f5>] rt_spin_lock+0x85/0xb0
[ 2.975320] [<ffffffff802f8251>] kmem_cache_alloc+0x51/0x1f0
[ 2.975320] [<ffffffff80253eab>] copy_process+0x9b/0x1500
[ 2.975320] [<ffffffff802553a0>] do_fork+0x90/0x4a0
[ 2.975320] [<ffffffff80213552>] kernel_thread+0x82/0xe0
[ 2.975320] [<ffffffff802135ba>] child_rip+0xa/0x20
[ 2.975320] [<ffffffffffffffff>] 0xffffffffffffffff
[ 2.975320] irq event stamp: 18114
[ 2.975320] hardirqs last enabled at (18113): [<ffffffff8021a935>] default_idle+0x55/0x60
[ 2.975320] hardirqs last disabled at (18114): [<ffffffff8021222a>] save_args+0x6a/0x70
[ 2.975320] softirqs last enabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
[ 2.975320] softirqs last disabled at (0): [<(null)>] (null)
[ 2.975320]
[ 2.975320] other info that might help us debug this:
[ 2.975320] no locks held by swapper/0.
[ 2.975320]
[ 2.975320] stack backtrace:
[ 2.975320] Pid: 0, comm: swapper Not tainted 2.6.29-rc4-rt1-tip #1
[ 2.975320] Call Trace:
[ 2.975320] <IRQ> [<ffffffff8027fcbc>] print_usage_bug+0x19c/0x200
[ 2.975320] [<ffffffff8021e6af>] ? save_stack_trace+0x2f/0x50
[ 2.975320] [<ffffffff80280315>] mark_lock+0x2a5/0xcd0
[ 2.975320] [<ffffffff8028245e>] __lock_acquire+0x88e/0x1b20
[ 2.975320] [<ffffffff8028216b>] ? __lock_acquire+0x59b/0x1b20
[ 2.975320] [<ffffffff802243e4>] ? post_set+0x64/0x70
[ 2.975320] [<ffffffff8028378f>] lock_acquire+0x9f/0xe0
[ 2.975320] [<ffffffff802f6823>] ? kfree+0x43/0xc0
[ 2.975320] [<ffffffff8077c7f5>] rt_spin_lock+0x85/0xb0
[ 2.975320] [<ffffffff802f6823>] ? kfree+0x43/0xc0
[ 2.975320] [<ffffffff802f6823>] kfree+0x43/0xc0
[ 2.975320] [<ffffffff8027f2fd>] ? trace_hardirqs_off+0xd/0x10
[ 2.975320] [<ffffffff8028a156>] generic_smp_call_function_single_interrupt+0x106/0x110
[ 2.975320] [<ffffffff80227144>] smp_call_function_single_interrupt+0x24/0x40
[ 2.975320] [<ffffffff80213223>] call_function_single_interrupt+0x13/0x20
[ 2.975320] <EOI> [<ffffffff8022df7b>] ? native_safe_halt+0xb/0x10
[ 2.975320] [<ffffffff8022df79>] ? native_safe_halt+0x9/0x10
[ 2.975320] [<ffffffff8021a93a>] ? default_idle+0x5a/0x60
[ 2.975320] [<ffffffff8021136e>] ? cpu_idle+0x7e/0x100
[ 2.975320] [<ffffffff80775b6c>] ? start_secondary+0x197/0x1eb


The second, a sysfs warning:

[ 8.042459] ------------[ cut here ]------------
[ 8.054763] WARNING: at fs/sysfs/dir.c:462 sysfs_add_one+0x51/0x60()
[ 8.066777] Hardware name: AMILO Li 2727
[ 8.078555] sysfs: duplicate filename '14:4' can not be created
[ 8.090353] Pid: 33, comm: work_on_cpu/0 Not tainted 2.6.29-rc4-rt1-tip #1
[ 8.102482] Call Trace:
[ 8.102492] [<ffffffff80255ea3>] warn_slowpath+0xd3/0x130
[ 8.102500] [<ffffffff8077cbeb>] ? _mutex_unlock+0x2b/0x40
[ 8.102508] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
[ 8.102513] [<ffffffff8035c835>] ? sysfs_find_dirent+0x35/0x50
[ 8.102518] [<ffffffff8035c9f4>] ? __sysfs_add_one+0x24/0xe0
[ 8.102523] [<ffffffff8035cb01>] sysfs_add_one+0x51/0x60
[ 8.102528] [<ffffffff8035dcb3>] sysfs_do_create_link+0x103/0x170
[ 8.102533] [<ffffffff8035dd53>] sysfs_create_link+0x13/0x20
[ 8.102540] [<ffffffff8051d739>] device_add+0x209/0x620
[ 8.102547] [<ffffffff80289128>] ? __rt_spin_lock_init+0x48/0x60
[ 8.102552] [<ffffffff8051db6e>] device_register+0x1e/0x30
[ 8.102557] [<ffffffff8051dc64>] device_create_vargs+0xe4/0x100
[ 8.102563] [<ffffffff8051dcd0>] device_create+0x50/0x60
[ 8.102570] [<ffffffff80612a25>] ? sound_insert_unit+0x55/0x1e0
[ 8.102575] [<ffffffff8077c843>] ? rt_spin_unlock+0x23/0x80
[ 8.102579] [<ffffffff80612a25>] ? sound_insert_unit+0x55/0x1e0
[ 8.102584] [<ffffffff80612afa>] sound_insert_unit+0x12a/0x1e0
[ 8.102590] [<ffffffff80612d35>] register_sound_special_device+0xa5/0x220
[ 8.102595] [<ffffffff8077c112>] ? rt_mutex_lock+0x22/0x60
[ 8.102601] [<ffffffff80625499>] snd_register_oss_device+0x239/0x2c0
[ 8.102608] [<ffffffff8063b2b0>] register_oss_dsp+0x60/0x90
[ 8.102613] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
[ 8.102618] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
[ 8.102623] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
[ 8.102629] [<ffffffff8063f5e7>] snd_pcm_oss_register_minor+0x167/0x260
[ 8.102634] [<ffffffff80627955>] ? snd_timer_dev_register+0x35/0x130
[ 8.102640] [<ffffffff80627a0b>] ? snd_timer_dev_register+0xeb/0x130
[ 8.102645] [<ffffffff80624855>] ? snd_device_register+0x65/0x80
[ 8.102650] [<ffffffff806365af>] ? snd_pcm_timer_init+0x14f/0x1a0
[ 8.102656] [<ffffffff8062b79d>] snd_pcm_dev_register+0x1ad/0x2c0
[ 8.102661] [<ffffffff80212000>] ? sys_rt_sigreturn+0x250/0x290
[ 8.102667] [<ffffffff806247c9>] snd_device_register_all+0x39/0x60
[ 8.102672] [<ffffffff8061f25a>] snd_card_register+0x3a/0x3d0
[ 8.102678] [<ffffffff8076e5a6>] azx_probe+0x7b6/0xa90
[ 8.102685] [<ffffffff80677960>] ? azx_send_cmd+0x0/0x120
[ 8.102690] [<ffffffff80677a80>] ? azx_get_response+0x0/0x240
[ 8.102695] [<ffffffff80676eb0>] ? azx_attach_pcm_stream+0x0/0x1c0
[ 8.102701] [<ffffffff8026a5c0>] ? do_work_for_cpu+0x0/0x30
[ 8.102708] [<ffffffff80455847>] local_pci_probe+0x17/0x20
[ 8.102713] [<ffffffff8026a5d8>] do_work_for_cpu+0x18/0x30
[ 8.102718] [<ffffffff8026a88d>] run_workqueue+0x16d/0x2c0
[ 8.102722] [<ffffffff8026a83a>] ? run_workqueue+0x11a/0x2c0
[ 8.102727] [<ffffffff8026aa8f>] worker_thread+0xaf/0x130
[ 8.102733] [<ffffffff8026f5f0>] ? autoremove_wake_function+0x0/0x40
[ 8.102738] [<ffffffff8026a9e0>] ? worker_thread+0x0/0x130
[ 8.102742] [<ffffffff8026a9e0>] ? worker_thread+0x0/0x130
[ 8.102747] [<ffffffff8026f0ee>] kthread+0x4e/0x90
[ 8.102752] [<ffffffff802135ba>] child_rip+0xa/0x20
[ 8.102757] [<ffffffff80212f54>] ? restore_args+0x0/0x30
[ 8.102762] [<ffffffff8026f0a0>] ? kthread+0x0/0x90
[ 8.102766] [<ffffffff802135b0>] ? child_rip+0x0/0x20
[ 8.102777] ---[ end trace 57b9b5741e12ebf7 ]---

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.TakeThisOut@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Ingo Molnar
External


Since: May 15, 2006
Posts: 3111



PostPosted: Thu Feb 12, 2009 4:10 am    Post subject: Re: [Announce] 2.6.29-rc4-rt1 [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

* Frederic Weisbecker <fweisbec DeleteThis @gmail.com> wrote:

> The first one, a lockdep warning:
>
> [ 2.975320] [<ffffffff8077c7f5>] rt_spin_lock+0x85/0xb0
> [ 2.975320] [<ffffffff802f6823>] ? kfree+0x43/0xc0
> [ 2.975320] [<ffffffff802f6823>] kfree+0x43/0xc0
> [ 2.975320] [<ffffffff8027f2fd>] ? trace_hardirqs_off+0xd/0x10
> [ 2.975320] [<ffffffff8028a156>] generic_smp_call_function_single_interrupt+0x106/0x110
> [ 2.975320] [<ffffffff80227144>] smp_call_function_single_interrupt+0x24/0x40
> [ 2.975320] [<ffffffff80213223>] call_function_single_interrupt+0x13/0x20
> [ 2.975320] <EOI> [<ffffffff8022df7b>] ? native_safe_halt+0xb/0x10
> [ 2.975320] [<ffffffff8022df79>] ? native_safe_halt+0x9/0x10
> [ 2.975320] [<ffffffff8021a93a>] ? default_idle+0x5a/0x60
> [ 2.975320] [<ffffffff8021136e>] ? cpu_idle+0x7e/0x100
> [ 2.975320] [<ffffffff80775b6c>] ? start_secondary+0x197/0x1eb

hm, that's a complex one - we do kfree() from IPI context, dunno why this
never triggered in our testing of -rt1.

> The second, a sysfs warning:
>
> [ 8.042459] ------------[ cut here ]------------
> [ 8.054763] WARNING: at fs/sysfs/dir.c:462 sysfs_add_one+0x51/0x60()
> [ 8.066777] Hardware name: AMILO Li 2727
> [ 8.078555] sysfs: duplicate filename '14:4' can not be created

that should be harmless - unless you only get it under -rt1.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo DeleteThis @vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Ingo Molnar
External


Since: May 15, 2006
Posts: 3111



PostPosted: Thu Feb 12, 2009 4:10 am    Post subject: [patch] rt: fix ipi kfree(), introduce IPI_SOFTIRQ [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

* Ingo Molnar <mingo.DeleteThis@elte.hu> wrote:

> hm, that's a complex one - we do kfree() from IPI context, [...]

The patch below might do the trick - it offloads this to a softirq.
Not tested yet.

Ingo

------------------>
Subject: rt: fix ipi kfree(), introduce IPI_SOFTIRQ
From: Ingo Molnar <mingo.DeleteThis@elte.hu>
Date: Thu Feb 12 09:06:11 CET 2009

in 2.6.28 generic_smp_call_function_interrupt() grew a kfree(),
which is a rather complex, sleepable method under -rt. But the
IPI code runs as a hardirq - which cannot run such code.

So defer this work to a softirq context instead. It still stays
on the same CPU so the percpu IPI assumptions are upheld.

Signed-off-by: Ingo Molnar <mingo.DeleteThis@elte.hu>
---
include/linux/interrupt.h | 1 +
include/linux/smp.h | 3 +++
init/main.c | 1 +
kernel/smp.c | 15 +++++++++++++--
4 files changed, 18 insertions(+), 2 deletions(-)

Index: tip/include/linux/interrupt.h
===================================================================
--- tip.orig/include/linux/interrupt.h
+++ tip/include/linux/interrupt.h
@@ -257,6 +257,7 @@ enum
SCHED_SOFTIRQ,
HRTIMER_SOFTIRQ,
RCU_SOFTIRQ, /* Preferable RCU should always be the last softirq */
+ IPI_SOFTIRQ,
/* Entries after this are ignored in split softirq mode */
MAX_SOFTIRQ,

Index: tip/include/linux/smp.h
===================================================================
--- tip.orig/include/linux/smp.h
+++ tip/include/linux/smp.h
@@ -104,6 +104,9 @@ void ipi_call_lock(void);
void ipi_call_unlock(void);
void ipi_call_lock_irq(void);
void ipi_call_unlock_irq(void);
+void ipi_init(void);
+#else
+static inline void ipi_init(void) { }
#endif

/*
Index: tip/init/main.c
===================================================================
--- tip.orig/init/main.c
+++ tip/init/main.c
@@ -606,6 +606,7 @@ asmlinkage void __init start_kernel(void
/* init some links before init_ISA_irqs() */
early_irq_init();
init_IRQ();
+ ipi_init();
pidhash_init();
init_timers();
hrtimers_init();
Index: tip/kernel/smp.c
===================================================================
--- tip.orig/kernel/smp.c
+++ tip/kernel/smp.c
@@ -4,12 +4,13 @@
* (C) Jens Axboe <jens.axboe.DeleteThis@oracle.com> 2008
*
*/
+#include <linux/smp.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/percpu.h>
-#include <linux/rcupdate.h>
#include <linux/rculist.h>
-#include <linux/smp.h>
+#include <linux/rcupdate.h>
+#include <linux/interrupt.h>

static DEFINE_PER_CPU(struct call_single_queue, call_single_queue);
static LIST_HEAD(call_function_queue);
@@ -152,6 +153,11 @@ void generic_smp_call_function_interrupt
*/
void generic_smp_call_function_single_interrupt(void)
{
+ raise_softirq_irqoff(IPI_SOFTIRQ);
+}
+
+static void run_generic_smp_call_function_single(struct softirq_action *h)
+{
struct call_single_queue *q = &__get_cpu_var(call_single_queue);
LIST_HEAD(list);

@@ -430,3 +436,8 @@ void ipi_call_unlock_irq(void)
{
spin_unlock_irq(&call_function_lock);
}
+
+void __init ipi_init(void)
+{
+ open_softirq(IPI_SOFTIRQ, run_generic_smp_call_function_single);
+}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.DeleteThis@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Ingo Molnar
External


Since: May 15, 2006
Posts: 3111



PostPosted: Thu Feb 12, 2009 4:10 am    Post subject: Re: [patch] rt: fix ipi kfree(), introduce IPI_SOFTIRQ [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

* Ingo Molnar <mingo DeleteThis @elte.hu> wrote:

> ------------------>
> Subject: rt: fix ipi kfree(), introduce IPI_SOFTIRQ
> From: Ingo Molnar <mingo DeleteThis @elte.hu>
> Date: Thu Feb 12 09:06:11 CET 2009
>
> in 2.6.28 generic_smp_call_function_interrupt() grew a kfree(),
> which is a rather complex, sleepable method under -rt. But the
> IPI code runs as a hardirq - which cannot run such code.
>
> So defer this work to a softirq context instead. It still stays
> on the same CPU so the percpu IPI assumptions are upheld.

On a second thought ...

I think we could eliminate the kfree() instead, and keep the
atomicity of IPI cross-calls. Linus expressed doubts about
the IPI kmalloc()/kfree() pair we do in the generic SMP IPI
code, suggesting that it probably does not help performance
all that much - so such a change might be upstream-able as well
and would keep -rt closer to mainline.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo DeleteThis @vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Ingo Molnar
External


Since: May 15, 2006
Posts: 3111



PostPosted: Thu Feb 12, 2009 4:10 am    Post subject: Re: [patch] rt: fix ipi kfree(), introduce IPI_SOFTIRQ [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

* Ingo Molnar <mingo.TakeThisOut@elte.hu> wrote:

> * Ingo Molnar <mingo.TakeThisOut@elte.hu> wrote:
>
> > hm, that's a complex one - we do kfree() from IPI context, [...]
>
> The patch below might do the trick - it offloads this to a softirq.
> Not tested yet.

ok, it's lightly tested now: it built and booted up fine to X with your config.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.TakeThisOut@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Peter Zijlstra
External


Since: Jun 06, 2007
Posts: 205



PostPosted: Thu Feb 12, 2009 5:10 am    Post subject: Re: [patch] rt: fix ipi kfree(), introduce IPI_SOFTIRQ [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Thu, 2009-02-12 at 09:19 +0100, Ingo Molnar wrote:
> * Ingo Molnar <mingo DeleteThis @elte.hu> wrote:
>
> > hm, that's a complex one - we do kfree() from IPI context, [...]
>
> The patch below might do the trick - it offloads this to a softirq.
> Not tested yet.

The simple fix is something like:

---
kernel/smp.c | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index bbedbb7..9b974c1 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -252,7 +252,11 @@ int smp_call_function_single(int cpu, void (*func) (void *info), void *info,
* will make sure the callee is done with the
* data before a new caller will use it.
*/
+#ifndef CONFIG_PREEMPT_RT
data = kmalloc(sizeof(*data), GFP_ATOMIC);
+#else
+ data = NULL;
+#endif
if (data)
data->flags = CSD_FLAG_ALLOC;
else {
@@ -347,7 +351,11 @@ void smp_call_function_many(const struct cpumask *mask,
return;
}

+#ifndef CONFIG_PREEMPT_RT
data = kmalloc(sizeof(*data) + cpumask_size(), GFP_ATOMIC);
+#else
+ data = NULL;
+#endif
if (unlikely(!data)) {
/* Slow path. */
for_each_online_cpu(cpu) {

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo DeleteThis @vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Ingo Molnar
External


Since: May 15, 2006
Posts: 3111



PostPosted: Thu Feb 12, 2009 6:10 am    Post subject: [patch] generic-ipi: remove kmalloc, cleanup [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

* Peter Zijlstra <peterz.DeleteThis@infradead.org> wrote:

> On Thu, 2009-02-12 at 09:19 +0100, Ingo Molnar wrote:
> > * Ingo Molnar <mingo.DeleteThis@elte.hu> wrote:
> >
> > > hm, that's a complex one - we do kfree() from IPI context, [...]
> >
> > The patch below might do the trick - it offloads this to a softirq.
> > Not tested yet.
>
> The simple fix is something like:
>
> ---
> kernel/smp.c | 8 ++++++++
> 1 files changed, 8 insertions(+), 0 deletions(-)

ok, i made it unconditional (not just a PREEMPT_RT hac) and did the
cleanup below on top of it.

I dont think repeat, queued IPIs are all that interesting from a
performance point of view. If they are, it will all be clearly
bisectable.

Ingo

--------------->
Subject: generic-ipi: remove kmalloc, cleanup
From: Ingo Molnar <mingo.DeleteThis@elte.hu>

Now that we dont use the kmalloc() sequence anymore, remove
CSD_FLAG_ALLOC and all its dependencies.

Signed-off-by: Ingo Molnar <mingo.DeleteThis@elte.hu>
---
kernel/smp.c | 86 +++++++++++------------------------------------------------
1 file changed, 17 insertions(+), 69 deletions(-)

Index: tip/kernel/smp.c
===================================================================
--- tip.orig/kernel/smp.c
+++ tip/kernel/smp.c
@@ -17,8 +17,7 @@ __cacheline_aligned_in_smp DEFINE_RAW_SP

enum {
CSD_FLAG_WAIT = 0x01,
- CSD_FLAG_ALLOC = 0x02,
- CSD_FLAG_LOCK = 0x04,
+ CSD_FLAG_LOCK = 0x02,
};

struct call_function_data {
@@ -85,15 +84,6 @@ static void generic_exec_single(int cpu,
csd_flag_wait(data);
}

-static void rcu_free_call_data(struct rcu_head *head)
-{
- struct call_function_data *data;
-
- data = container_of(head, struct call_function_data, rcu_head);
-
- kfree(data);
-}
-
/*
* Invoked by arch to handle an IPI for call function. Must be called with
* interrupts disabled.
@@ -138,8 +128,6 @@ void generic_smp_call_function_interrupt
smp_wmb();
data->csd.flags &= ~CSD_FLAG_WAIT;
}
- if (data->csd.flags & CSD_FLAG_ALLOC)
- call_rcu(&data->rcu_head, rcu_free_call_data);
}
rcu_read_unlock();

@@ -190,8 +178,7 @@ void generic_smp_call_function_single_in
} else if (data_flags & CSD_FLAG_LOCK) {
smp_wmb();
data->flags &= ~CSD_FLAG_LOCK;
- } else if (data_flags & CSD_FLAG_ALLOC)
- kfree(data);
+ }
}
/*
* See comment on outer loop
@@ -236,13 +223,11 @@ int smp_call_function_single(int cpu, vo
/*
* We are calling a function on a single CPU
* and we are not going to wait for it to finish.
- * We first try to allocate the data, but if we
- * fail, we fall back to use a per cpu data to pass
- * the information to that CPU. Since all callers
- * of this code will use the same data, we must
- * synchronize the callers to prevent a new caller
- * from corrupting the data before the callee
- * can access it.
+ * We use a per cpu data to pass the information to
+ * that CPU. Since all callers of this code will use
+ * the same data, we must synchronize the callers to
+ * prevent a new caller from corrupting the data before
+ * the callee can access it.
*
* The CSD_FLAG_LOCK is used to let us know when
* the IPI handler is done with the data.
@@ -252,15 +237,10 @@ int smp_call_function_single(int cpu, vo
* will make sure the callee is done with the
* data before a new caller will use it.
*/
- data = NULL;
- if (data)
- data->flags = CSD_FLAG_ALLOC;
- else {
- data = &per_cpu(csd_data, me);
- while (data->flags & CSD_FLAG_LOCK)
- cpu_relax();
- data->flags = CSD_FLAG_LOCK;
- }
+ data = &per_cpu(csd_data, me);
+ while (data->flags & CSD_FLAG_LOCK)
+ cpu_relax();
+ data->flags = CSD_FLAG_LOCK;
} else {
data = &d;
data->flags = CSD_FLAG_WAIT;
@@ -321,8 +301,6 @@ void smp_call_function_many(const struct
void (*func)(void *), void *info,
bool wait)
{
- struct call_function_data *data;
- unsigned long flags;
int cpu, next_cpu;

/* Can deadlock when called with interrupts disabled */
@@ -347,43 +325,13 @@ void smp_call_function_many(const struct
return;
}

- data = NULL;
- if (unlikely(!data)) {
- /* Slow path. */
- for_each_online_cpu(cpu) {
- if (cpu == smp_processor_id())
- continue;
- if (cpumask_test_cpu(cpu, mask))
- smp_call_function_single(cpu, func, info, wait);
- }
- return;
+ /* Slow path. */
+ for_each_online_cpu(cpu) {
+ if (cpu == smp_processor_id())
+ continue;
+ if (cpumask_test_cpu(cpu, mask))
+ smp_call_function_single(cpu, func, info, wait);
}
-
- spin_lock_init(&data->lock);
- data->csd.flags = CSD_FLAG_ALLOC;
- if (wait)
- data->csd.flags |= CSD_FLAG_WAIT;
- data->csd.func = func;
- data->csd.info = info;
- cpumask_and(to_cpumask(data->cpumask_bits), mask, cpu_online_mask);
- cpumask_clear_cpu(smp_processor_id(), to_cpumask(data->cpumask_bits));
- data->refs = cpumask_weight(to_cpumask(data->cpumask_bits));
-
- spin_lock_irqsave(&call_function_lock, flags);
- list_add_tail_rcu(&data->csd.list, &call_function_queue);
- spin_unlock_irqrestore(&call_function_lock, flags);
-
- /*
- * Make the list addition visible before sending the ipi.
- */
- smp_mb();
-
- /* Send a message to all CPUs in the map */
- arch_send_call_function_ipi_mask(to_cpumask(data->cpumask_bits));
-
- /* optionally wait for the CPUs to complete */
- if (wait)
- csd_flag_wait(&data->csd);
}
EXPORT_SYMBOL(smp_call_function_many);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.DeleteThis@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Peter Zijlstra
External


Since: Jun 06, 2007
Posts: 205



PostPosted: Thu Feb 12, 2009 6:10 am    Post subject: Re: [patch] generic-ipi: remove kmalloc, cleanup [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Thu, 2009-02-12 at 11:07 +0100, Ingo Molnar wrote:
> * Peter Zijlstra <peterz.DeleteThis@infradead.org> wrote:
>
> > On Thu, 2009-02-12 at 09:19 +0100, Ingo Molnar wrote:
> > > * Ingo Molnar <mingo.DeleteThis@elte.hu> wrote:
> > >
> > > > hm, that's a complex one - we do kfree() from IPI context, [...]
> > >
> > > The patch below might do the trick - it offloads this to a softirq.
> > > Not tested yet.
> >
> > The simple fix is something like:
> >
> > ---
> > kernel/smp.c | 8 ++++++++
> > 1 files changed, 8 insertions(+), 0 deletions(-)
>
> ok, i made it unconditional (not just a PREEMPT_RT hac) and did the
> cleanup below on top of it.
>
> I dont think repeat, queued IPIs are all that interesting from a
> performance point of view. If they are, it will all be clearly
> bisectable.

Right, except I really don't like the smp_call_function_many() slow path
that's now the only path.

Rusty did that I think, but he also had some idea on how to fix it, I
think it boiled down to sticking a count in the call data instead of the
full cpumask.

So I'd rather we first fix that code, and then remove the kmalloc
all-together like you propose here.

> Ingo
>
> --------------->
> Subject: generic-ipi: remove kmalloc, cleanup
> From: Ingo Molnar <mingo.DeleteThis@elte.hu>
>
> Now that we dont use the kmalloc() sequence anymore, remove
> CSD_FLAG_ALLOC and all its dependencies.
>
> Signed-off-by: Ingo Molnar <mingo.DeleteThis@elte.hu>
> ---
> kernel/smp.c | 86 +++++++++++------------------------------------------------
> 1 file changed, 17 insertions(+), 69 deletions(-)
>
> Index: tip/kernel/smp.c
> ===================================================================
> --- tip.orig/kernel/smp.c
> +++ tip/kernel/smp.c
> @@ -17,8 +17,7 @@ __cacheline_aligned_in_smp DEFINE_RAW_SP
>
> enum {
> CSD_FLAG_WAIT = 0x01,
> - CSD_FLAG_ALLOC = 0x02,
> - CSD_FLAG_LOCK = 0x04,
> + CSD_FLAG_LOCK = 0x02,
> };
>
> struct call_function_data {
> @@ -85,15 +84,6 @@ static void generic_exec_single(int cpu,
> csd_flag_wait(data);
> }
>
> -static void rcu_free_call_data(struct rcu_head *head)
> -{
> - struct call_function_data *data;
> -
> - data = container_of(head, struct call_function_data, rcu_head);
> -
> - kfree(data);
> -}
> -
> /*
> * Invoked by arch to handle an IPI for call function. Must be called with
> * interrupts disabled.
> @@ -138,8 +128,6 @@ void generic_smp_call_function_interrupt
> smp_wmb();
> data->csd.flags &= ~CSD_FLAG_WAIT;
> }
> - if (data->csd.flags & CSD_FLAG_ALLOC)
> - call_rcu(&data->rcu_head, rcu_free_call_data);
> }
> rcu_read_unlock();
>
> @@ -190,8 +178,7 @@ void generic_smp_call_function_single_in
> } else if (data_flags & CSD_FLAG_LOCK) {
> smp_wmb();
> data->flags &= ~CSD_FLAG_LOCK;
> - } else if (data_flags & CSD_FLAG_ALLOC)
> - kfree(data);
> + }
> }
> /*
> * See comment on outer loop
> @@ -236,13 +223,11 @@ int smp_call_function_single(int cpu, vo
> /*
> * We are calling a function on a single CPU
> * and we are not going to wait for it to finish.
> - * We first try to allocate the data, but if we
> - * fail, we fall back to use a per cpu data to pass
> - * the information to that CPU. Since all callers
> - * of this code will use the same data, we must
> - * synchronize the callers to prevent a new caller
> - * from corrupting the data before the callee
> - * can access it.
> + * We use a per cpu data to pass the information to
> + * that CPU. Since all callers of this code will use
> + * the same data, we must synchronize the callers to
> + * prevent a new caller from corrupting the data before
> + * the callee can access it.
> *
> * The CSD_FLAG_LOCK is used to let us know when
> * the IPI handler is done with the data.
> @@ -252,15 +237,10 @@ int smp_call_function_single(int cpu, vo
> * will make sure the callee is done with the
> * data before a new caller will use it.
> */
> - data = NULL;
> - if (data)
> - data->flags = CSD_FLAG_ALLOC;
> - else {
> - data = &per_cpu(csd_data, me);
> - while (data->flags & CSD_FLAG_LOCK)
> - cpu_relax();
> - data->flags = CSD_FLAG_LOCK;
> - }
> + data = &per_cpu(csd_data, me);
> + while (data->flags & CSD_FLAG_LOCK)
> + cpu_relax();
> + data->flags = CSD_FLAG_LOCK;
> } else {
> data = &d;
> data->flags = CSD_FLAG_WAIT;
> @@ -321,8 +301,6 @@ void smp_call_function_many(const struct
> void (*func)(void *), void *info,
> bool wait)
> {
> - struct call_function_data *data;
> - unsigned long flags;
> int cpu, next_cpu;
>
> /* Can deadlock when called with interrupts disabled */
> @@ -347,43 +325,13 @@ void smp_call_function_many(const struct
> return;
> }
>
> - data = NULL;
> - if (unlikely(!data)) {
> - /* Slow path. */
> - for_each_online_cpu(cpu) {
> - if (cpu == smp_processor_id())
> - continue;
> - if (cpumask_test_cpu(cpu, mask))
> - smp_call_function_single(cpu, func, info, wait);
> - }
> - return;
> + /* Slow path. */
> + for_each_online_cpu(cpu) {
> + if (cpu == smp_processor_id())
> + continue;
> + if (cpumask_test_cpu(cpu, mask))
> + smp_call_function_single(cpu, func, info, wait);
> }
> -
> - spin_lock_init(&data->lock);
> - data->csd.flags = CSD_FLAG_ALLOC;
> - if (wait)
> - data->csd.flags |= CSD_FLAG_WAIT;
> - data->csd.func = func;
> - data->csd.info = info;
> - cpumask_and(to_cpumask(data->cpumask_bits), mask, cpu_online_mask);
> - cpumask_clear_cpu(smp_processor_id(), to_cpumask(data->cpumask_bits));
> - data->refs = cpumask_weight(to_cpumask(data->cpumask_bits));
> -
> - spin_lock_irqsave(&call_function_lock, flags);
> - list_add_tail_rcu(&data->csd.list, &call_function_queue);
> - spin_unlock_irqrestore(&call_function_lock, flags);
> -
> - /*
> - * Make the list addition visible before sending the ipi.
> - */
> - smp_mb();
> -
> - /* Send a message to all CPUs in the map */
> - arch_send_call_function_ipi_mask(to_cpumask(data->cpumask_bits));
> -
> - /* optionally wait for the CPUs to complete */
> - if (wait)
> - csd_flag_wait(&data->csd);
> }
> EXPORT_SYMBOL(smp_call_function_many);
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.DeleteThis@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Ingo Molnar
External


Since: May 15, 2006
Posts: 3111



PostPosted: Thu Feb 12, 2009 6:10 am    Post subject: [patch] rt: res_counter fix [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

* Frederic Weisbecker <fweisbec.DeleteThis@gmail.com> wrote:

> > I get some sleep while atomic warnings.
> > I've put the log and my config in attachment.
>
> Note, it's a wicked bug: I can't reproduce it anymore.
> I would have been glad to give you an irqsoff trace but I can't Smile

i tried your config and after a few bootups the warning did trigger.

It's the new resource counter code. The IRQ flags disabling it does
seems a bit dubious to me. Peter, what do you think?

Frederic, could you try the patch below?

Ingo

----------->
Subject: rt: res_counter fix
From: Ingo Molnar <mingo.DeleteThis@elte.hu>
Date: Thu Feb 12 11:11:47 CET 2009

Frederic Weisbecker reported this warning:

[ 45.228562] BUG: sleeping function called from invalid context at kernel/rtmutex.c:683
[ 45.228571] in_atomic(): 0, irqs_disabled(): 1, pid: 4290, name: ntpdate
[ 45.228576] INFO: lockdep is turned off.
[ 45.228580] irq event stamp: 0
[ 45.228583] hardirqs last enabled at (0): [<(null)>] (null)
[ 45.228589] hardirqs last disabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
[ 45.228602] softirqs last enabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
[ 45.228609] softirqs last disabled at (0): [<(null)>] (null)
[ 45.228617] Pid: 4290, comm: ntpdate Tainted: G W 2.6.29-rc4-rt1-tip #1
[ 45.228622] Call Trace:
[ 45.228632] [<ffffffff8027dfb0>] ? print_irqtrace_events+0xd0/0xe0
[ 45.228639] [<ffffffff8024cd73>] __might_sleep+0x113/0x130
[ 45.228646] [<ffffffff8077c811>] rt_spin_lock+0xa1/0xb0
[ 45.228653] [<ffffffff80296a3d>] res_counter_charge+0x5d/0x130
[ 45.228660] [<ffffffff802fb67f>] __mem_cgroup_try_charge+0x7f/0x180
[ 45.228667] [<ffffffff802fc407>] mem_cgroup_charge_common+0x57/0x90
[ 45.228674] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
[ 45.228680] [<ffffffff802fc49d>] mem_cgroup_newpage_charge+0x5d/0x60
[ 45.228688] [<ffffffff802d94ce>] __do_fault+0x29e/0x4c0
[ 45.228694] [<ffffffff8077c843>] ? rt_spin_unlock+0x23/0x80
[ 45.228700] [<ffffffff802db8b5>] handle_mm_fault+0x205/0x890
[ 45.228707] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
[ 45.228714] [<ffffffff8023495e>] do_page_fault+0x11e/0x2a0
[ 45.228720] [<ffffffff8077e5a5>] page_fault+0x25/0x30
[ 45.228727] [<ffffffff8043e1ed>] ? __clear_user+0x3d/0x70
[ 45.228733] [<ffffffff8043e1d1>] ? __clear_user+0x21/0x70

The reason is the raw IRQ flag use of kernel/res_counter.c.

The irq flags tricks there seem a bit pointless: it cannot
protect the c->parent linkage because local_irq_save() is
only per CPU.

So replace it with _nort(). This code needs a second look.

Reported-by: Frederic Weisbecker <fweisbec.DeleteThis@gmail.com>
Signed-off-by: Ingo Molnar <mingo.DeleteThis@elte.hu>
---
kernel/res_counter.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

Index: tip/kernel/res_counter.c
===================================================================
--- tip.orig/kernel/res_counter.c
+++ tip/kernel/res_counter.c
@@ -43,7 +43,7 @@ int res_counter_charge(struct res_counte
struct res_counter *c, *u;

*limit_fail_at = NULL;
- local_irq_save(flags);
+ local_irq_save_nort(flags);
for (c = counter; c != NULL; c = c->parent) {
spin_lock(&c->lock);
ret = res_counter_charge_locked(c, val);
@@ -62,7 +62,7 @@ undo:
spin_unlock(&u->lock);
}
done:
- local_irq_restore(flags);
+ local_irq_restore_nort(flags);
return ret;
}

@@ -79,13 +79,13 @@ void res_counter_uncharge(struct res_cou
unsigned long flags;
struct res_counter *c;

- local_irq_save(flags);
+ local_irq_save_nort(flags);
for (c = counter; c != NULL; c = c->parent) {
spin_lock(&c->lock);
res_counter_uncharge_locked(c, val);
spin_unlock(&c->lock);
}
- local_irq_restore(flags);
+ local_irq_restore_nort(flags);
}


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.DeleteThis@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Ingo Molnar
External


Since: May 15, 2006
Posts: 3111



PostPosted: Thu Feb 12, 2009 6:10 am    Post subject: [patch] rt: res_counter fix, v2 [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

* Ingo Molnar <mingo.RemoveThis@elte.hu> wrote:

> Frederic, could you try the patch below?

Please try v2 below - it might even build Wink

Ingo

------------------->
Subject: rt: res_counter fix
From: Ingo Molnar <mingo.RemoveThis@elte.hu>
Date: Thu Feb 12 11:11:47 CET 2009

Frederic Weisbecker reported this warning:

[ 45.228562] BUG: sleeping function called from invalid context at kernel/rtmutex.c:683
[ 45.228571] in_atomic(): 0, irqs_disabled(): 1, pid: 4290, name: ntpdate
[ 45.228576] INFO: lockdep is turned off.
[ 45.228580] irq event stamp: 0
[ 45.228583] hardirqs last enabled at (0): [<(null)>] (null)
[ 45.228589] hardirqs last disabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
[ 45.228602] softirqs last enabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
[ 45.228609] softirqs last disabled at (0): [<(null)>] (null)
[ 45.228617] Pid: 4290, comm: ntpdate Tainted: G W 2.6.29-rc4-rt1-tip #1
[ 45.228622] Call Trace:
[ 45.228632] [<ffffffff8027dfb0>] ? print_irqtrace_events+0xd0/0xe0
[ 45.228639] [<ffffffff8024cd73>] __might_sleep+0x113/0x130
[ 45.228646] [<ffffffff8077c811>] rt_spin_lock+0xa1/0xb0
[ 45.228653] [<ffffffff80296a3d>] res_counter_charge+0x5d/0x130
[ 45.228660] [<ffffffff802fb67f>] __mem_cgroup_try_charge+0x7f/0x180
[ 45.228667] [<ffffffff802fc407>] mem_cgroup_charge_common+0x57/0x90
[ 45.228674] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
[ 45.228680] [<ffffffff802fc49d>] mem_cgroup_newpage_charge+0x5d/0x60
[ 45.228688] [<ffffffff802d94ce>] __do_fault+0x29e/0x4c0
[ 45.228694] [<ffffffff8077c843>] ? rt_spin_unlock+0x23/0x80
[ 45.228700] [<ffffffff802db8b5>] handle_mm_fault+0x205/0x890
[ 45.228707] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
[ 45.228714] [<ffffffff8023495e>] do_page_fault+0x11e/0x2a0
[ 45.228720] [<ffffffff8077e5a5>] page_fault+0x25/0x30
[ 45.228727] [<ffffffff8043e1ed>] ? __clear_user+0x3d/0x70
[ 45.228733] [<ffffffff8043e1d1>] ? __clear_user+0x21/0x70

The reason is the raw IRQ flag use of kernel/res_counter.c.

The irq flags tricks there seem a bit pointless: it cannot
protect the c->parent linkage because local_irq_save() is
only per CPU.

So replace it with _nort(). This code needs a second look.

Reported-by: Frederic Weisbecker <fweisbec.RemoveThis@gmail.com>
Signed-off-by: Ingo Molnar <mingo.RemoveThis@elte.hu>
---
kernel/res_counter.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

Index: tip/kernel/res_counter.c
===================================================================
--- tip.orig/kernel/res_counter.c
+++ tip/kernel/res_counter.c
@@ -14,6 +14,7 @@
#include <linux/res_counter.h>
#include <linux/uaccess.h>
#include <linux/mm.h>
+#include <linux/interrupt.h>

void res_counter_init(struct res_counter *counter, struct res_counter *parent)
{
@@ -43,7 +44,7 @@ int res_counter_charge(struct res_counte
struct res_counter *c, *u;

*limit_fail_at = NULL;
- local_irq_save(flags);
+ local_irq_save_nort(flags);
for (c = counter; c != NULL; c = c->parent) {
spin_lock(&c->lock);
ret = res_counter_charge_locked(c, val);
@@ -62,7 +63,7 @@ undo:
spin_unlock(&u->lock);
}
done:
- local_irq_restore(flags);
+ local_irq_restore_nort(flags);
return ret;
}

@@ -79,13 +80,13 @@ void res_counter_uncharge(struct res_cou
unsigned long flags;
struct res_counter *c;

- local_irq_save(flags);
+ local_irq_save_nort(flags);
for (c = counter; c != NULL; c = c->parent) {
spin_lock(&c->lock);
res_counter_uncharge_locked(c, val);
spin_unlock(&c->lock);
}
- local_irq_restore(flags);
+ local_irq_restore_nort(flags);
}


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.RemoveThis@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
KAMEZAWA Hiroyuki
External


Since: May 17, 2006
Posts: 404



PostPosted: Thu Feb 12, 2009 7:10 am    Post subject: Re: [patch] rt: res_counter fix, v2 [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Thu, 12 Feb 2009 11:21:13 +0100
Ingo Molnar <mingo RemoveThis @elte.hu> wrote:

>
> * Ingo Molnar <mingo RemoveThis @elte.hu> wrote:
>
> > Frederic, could you try the patch below?
>
> Please try v2 below - it might even build Wink
>
> Ingo
>
> ------------------->
> Subject: rt: res_counter fix
> From: Ingo Molnar <mingo RemoveThis @elte.hu>
> Date: Thu Feb 12 11:11:47 CET 2009
>
> Frederic Weisbecker reported this warning:
>
> [ 45.228562] BUG: sleeping function called from invalid context at kernel/rtmutex.c:683
> [ 45.228571] in_atomic(): 0, irqs_disabled(): 1, pid: 4290, name: ntpdate
> [ 45.228576] INFO: lockdep is turned off.
> [ 45.228580] irq event stamp: 0
> [ 45.228583] hardirqs last enabled at (0): [<(null)>] (null)
> [ 45.228589] hardirqs last disabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
> [ 45.228602] softirqs last enabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
> [ 45.228609] softirqs last disabled at (0): [<(null)>] (null)
> [ 45.228617] Pid: 4290, comm: ntpdate Tainted: G W 2.6.29-rc4-rt1-tip #1
> [ 45.228622] Call Trace:
> [ 45.228632] [<ffffffff8027dfb0>] ? print_irqtrace_events+0xd0/0xe0
> [ 45.228639] [<ffffffff8024cd73>] __might_sleep+0x113/0x130
> [ 45.228646] [<ffffffff8077c811>] rt_spin_lock+0xa1/0xb0
> [ 45.228653] [<ffffffff80296a3d>] res_counter_charge+0x5d/0x130
> [ 45.228660] [<ffffffff802fb67f>] __mem_cgroup_try_charge+0x7f/0x180
> [ 45.228667] [<ffffffff802fc407>] mem_cgroup_charge_common+0x57/0x90
> [ 45.228674] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
> [ 45.228680] [<ffffffff802fc49d>] mem_cgroup_newpage_charge+0x5d/0x60
> [ 45.228688] [<ffffffff802d94ce>] __do_fault+0x29e/0x4c0
> [ 45.228694] [<ffffffff8077c843>] ? rt_spin_unlock+0x23/0x80
> [ 45.228700] [<ffffffff802db8b5>] handle_mm_fault+0x205/0x890
> [ 45.228707] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
> [ 45.228714] [<ffffffff8023495e>] do_page_fault+0x11e/0x2a0
> [ 45.228720] [<ffffffff8077e5a5>] page_fault+0x25/0x30
> [ 45.228727] [<ffffffff8043e1ed>] ? __clear_user+0x3d/0x70
> [ 45.228733] [<ffffffff8043e1d1>] ? __clear_user+0x21/0x70
>
> The reason is the raw IRQ flag use of kernel/res_counter.c.
>
> The irq flags tricks there seem a bit pointless: it cannot
> protect the c->parent linkage because local_irq_save() is
> only per CPU.
>
> So replace it with _nort(). This code needs a second look.
>
I'm sorry for no knowledge about RT. Could you teach me what
local_irq_save_nort() does ?

Hmm, how about just replacaing _irq() with preempt_disable()/enable() ?
xxx_nort() is better ?

AFAIK, these will not be called from irq context. (Added Balbir to CC:)

Regards,
-Kame


> Reported-by: Frederic Weisbecker <fweisbec RemoveThis @gmail.com>
> Signed-off-by: Ingo Molnar <mingo RemoveThis @elte.hu>
> ---
> kernel/res_counter.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> Index: tip/kernel/res_counter.c
> ===================================================================
> --- tip.orig/kernel/res_counter.c
> +++ tip/kernel/res_counter.c
> @@ -14,6 +14,7 @@
> #include <linux/res_counter.h>
> #include <linux/uaccess.h>
> #include <linux/mm.h>
> +#include <linux/interrupt.h>
>
> void res_counter_init(struct res_counter *counter, struct res_counter *parent)
> {
> @@ -43,7 +44,7 @@ int res_counter_charge(struct res_counte
> struct res_counter *c, *u;
>
> *limit_fail_at = NULL;
> - local_irq_save(flags);
> + local_irq_save_nort(flags);
> for (c = counter; c != NULL; c = c->parent) {
> spin_lock(&c->lock);
> ret = res_counter_charge_locked(c, val);
> @@ -62,7 +63,7 @@ undo:
> spin_unlock(&u->lock);
> }
> done:
> - local_irq_restore(flags);
> + local_irq_restore_nort(flags);
> return ret;
> }
>
> @@ -79,13 +80,13 @@ void res_counter_uncharge(struct res_cou
> unsigned long flags;
> struct res_counter *c;
>
> - local_irq_save(flags);
> + local_irq_save_nort(flags);
> for (c = counter; c != NULL; c = c->parent) {
> spin_lock(&c->lock);
> res_counter_uncharge_locked(c, val);
> spin_unlock(&c->lock);
> }
> - local_irq_restore(flags);
> + local_irq_restore_nort(flags);
> }
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo RemoveThis @vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo RemoveThis @vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Ingo Molnar
External


Since: May 15, 2006
Posts: 3111



PostPosted: Thu Feb 12, 2009 7:10 am    Post subject: [patch] sched: cpu hotplug fix [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

The fix below addresses a lockdep warning i triggered with your config.

(You could hit this if you try to use s2ram or hibernation or explicit
CPU hotplug ops.)

Ingo

-------------->
Subject: sched: cpu hotplug fix
From: Ingo Molnar <mingo.TakeThisOut@elte.hu>
Date: Thu Feb 12 11:35:40 CET 2009

rq_attach_root() does a kfree() with the runqueue lock held.

That's not a very wise move, fix it.

Signed-off-by: Ingo Molnar <mingo.TakeThisOut@elte.hu>
---
kernel/sched.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

Index: tip/kernel/sched.c
===================================================================
--- tip.orig/kernel/sched.c
+++ tip/kernel/sched.c
@@ -7619,20 +7619,26 @@ static void free_rootdomain(struct root_

static void rq_attach_root(struct rq *rq, struct root_domain *rd)
{
+ struct root_domain *old_rd = NULL;
unsigned long flags;

spin_lock_irqsave(&rq->lock, flags);

if (rq->rd) {
- struct root_domain *old_rd = rq->rd;
+ old_rd = rq->rd;

if (cpumask_test_cpu(rq->cpu, old_rd->online))
set_rq_offline(rq);

cpumask_clear_cpu(rq->cpu, old_rd->span);

- if (atomic_dec_and_test(&old_rd->refcount))
- free_rootdomain(old_rd);
+ /*
+ * If we dont want to free the old_rt yet then
+ * set old_rd to NULL to skip the freeing later
+ * in this function:
+ */
+ if (!atomic_dec_and_test(&old_rd->refcount))
+ old_rd = NULL;
}

atomic_inc(&rd->refcount);
@@ -7643,6 +7649,9 @@ static void rq_attach_root(struct rq *rq
set_rq_online(rq);

spin_unlock_irqrestore(&rq->lock, flags);
+
+ if (old_rd)
+ free_rootdomain(old_rd);
}

static int __init_refok init_rootdomain(struct root_domain *rd, bool bootmem)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.TakeThisOut@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Ingo Molnar
External


Since: May 15, 2006
Posts: 3111



PostPosted: Thu Feb 12, 2009 7:10 am    Post subject: Re: [patch] rt: res_counter fix, v2 [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

* KAMEZAWA Hiroyuki <kamezawa.hiroyu DeleteThis @jp.fujitsu.com> wrote:

> On Thu, 12 Feb 2009 11:21:13 +0100
> Ingo Molnar <mingo DeleteThis @elte.hu> wrote:
>
> >
> > * Ingo Molnar <mingo DeleteThis @elte.hu> wrote:
> >
> > > Frederic, could you try the patch below?
> >
> > Please try v2 below - it might even build Wink
> >
> > Ingo
> >
> > ------------------->
> > Subject: rt: res_counter fix
> > From: Ingo Molnar <mingo DeleteThis @elte.hu>
> > Date: Thu Feb 12 11:11:47 CET 2009
> >
> > Frederic Weisbecker reported this warning:
> >
> > [ 45.228562] BUG: sleeping function called from invalid context at kernel/rtmutex.c:683
> > [ 45.228571] in_atomic(): 0, irqs_disabled(): 1, pid: 4290, name: ntpdate
> > [ 45.228576] INFO: lockdep is turned off.
> > [ 45.228580] irq event stamp: 0
> > [ 45.228583] hardirqs last enabled at (0): [<(null)>] (null)
> > [ 45.228589] hardirqs last disabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
> > [ 45.228602] softirqs last enabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
> > [ 45.228609] softirqs last disabled at (0): [<(null)>] (null)
> > [ 45.228617] Pid: 4290, comm: ntpdate Tainted: G W 2.6.29-rc4-rt1-tip #1
> > [ 45.228622] Call Trace:
> > [ 45.228632] [<ffffffff8027dfb0>] ? print_irqtrace_events+0xd0/0xe0
> > [ 45.228639] [<ffffffff8024cd73>] __might_sleep+0x113/0x130
> > [ 45.228646] [<ffffffff8077c811>] rt_spin_lock+0xa1/0xb0
> > [ 45.228653] [<ffffffff80296a3d>] res_counter_charge+0x5d/0x130
> > [ 45.228660] [<ffffffff802fb67f>] __mem_cgroup_try_charge+0x7f/0x180
> > [ 45.228667] [<ffffffff802fc407>] mem_cgroup_charge_common+0x57/0x90
> > [ 45.228674] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
> > [ 45.228680] [<ffffffff802fc49d>] mem_cgroup_newpage_charge+0x5d/0x60
> > [ 45.228688] [<ffffffff802d94ce>] __do_fault+0x29e/0x4c0
> > [ 45.228694] [<ffffffff8077c843>] ? rt_spin_unlock+0x23/0x80
> > [ 45.228700] [<ffffffff802db8b5>] handle_mm_fault+0x205/0x890
> > [ 45.228707] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
> > [ 45.228714] [<ffffffff8023495e>] do_page_fault+0x11e/0x2a0
> > [ 45.228720] [<ffffffff8077e5a5>] page_fault+0x25/0x30
> > [ 45.228727] [<ffffffff8043e1ed>] ? __clear_user+0x3d/0x70
> > [ 45.228733] [<ffffffff8043e1d1>] ? __clear_user+0x21/0x70
> >
> > The reason is the raw IRQ flag use of kernel/res_counter.c.
> >
> > The irq flags tricks there seem a bit pointless: it cannot
> > protect the c->parent linkage because local_irq_save() is
> > only per CPU.
> >
> > So replace it with _nort(). This code needs a second look.
> >
> I'm sorry for no knowledge about RT. Could you teach me what
> local_irq_save_nort() does ?
>
> Hmm, how about just replacaing _irq() with preempt_disable()/enable() ?
> xxx_nort() is better ?
>
> AFAIK, these will not be called from irq context. (Added Balbir to CC:)

_nort() will just turn them into NOPs in essence.

The question is, are these local IRQ flags manipulations really needed
in this code, and if yes, why?

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo DeleteThis @vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Peter Zijlstra
External


Since: Jun 06, 2007
Posts: 205



PostPosted: Thu Feb 12, 2009 8:10 am    Post subject: Re: [patch] generic-ipi: remove kmalloc, cleanup [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Thu, 2009-02-12 at 11:16 +0100, Peter Zijlstra wrote:
> On Thu, 2009-02-12 at 11:07 +0100, Ingo Molnar wrote:
> > * Peter Zijlstra <peterz.DeleteThis@infradead.org> wrote:
> >
> > > On Thu, 2009-02-12 at 09:19 +0100, Ingo Molnar wrote:
> > > > * Ingo Molnar <mingo.DeleteThis@elte.hu> wrote:
> > > >
> > > > > hm, that's a complex one - we do kfree() from IPI context, [...]
> > > >
> > > > The patch below might do the trick - it offloads this to a softirq.
> > > > Not tested yet.
> > >
> > > The simple fix is something like:
> > >
> > > ---
> > > kernel/smp.c | 8 ++++++++
> > > 1 files changed, 8 insertions(+), 0 deletions(-)
> >
> > ok, i made it unconditional (not just a PREEMPT_RT hac) and did the
> > cleanup below on top of it.
> >
> > I dont think repeat, queued IPIs are all that interesting from a
> > performance point of view. If they are, it will all be clearly
> > bisectable.
>
> Right, except I really don't like the smp_call_function_many() slow path
> that's now the only path.
>
> Rusty did that I think, but he also had some idea on how to fix it, I
> think it boiled down to sticking a count in the call data instead of the
> full cpumask.
>
> So I'd rather we first fix that code, and then remove the kmalloc
> all-together like you propose here.

Right, I can't see a way around carrying that cpumask, there's just too
much that can go wrong without it.

So it put in unconditionally, how about this?


--
Subject: generic-smp: remove single ipi fallback for smp_call_function_many()

In preparation of removing the kmalloc() calls from the generic-ipi code
get rid of the single ipi fallback for smp_call_function_many().

Because we cannot get around carrying the cpumask in the data -- imagine
2 such calls with different but overlapping masks -- put in a full mask.

Signed-off-by: Peter Zijlstra <a.p.zijlstra.DeleteThis@chello.nl>
---
kernel/smp.c | 47 ++++++++++++++++++++++++-----------------------
1 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index bbedbb7..e6658e5 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -26,7 +26,7 @@ struct call_function_data {
spinlock_t lock;
unsigned int refs;
struct rcu_head rcu_head;
- unsigned long cpumask_bits[];
+ struct cpumask cpumask;
};

struct call_single_queue {
@@ -111,13 +111,13 @@ void generic_smp_call_function_interrupt(void)
list_for_each_entry_rcu(data, &call_function_queue, csd.list) {
int refs;

- if (!cpumask_test_cpu(cpu, to_cpumask(data->cpumask_bits)))
+ if (!cpumask_test_cpu(cpu, &data->cpumask))
continue;

data->csd.func(data->csd.info);

spin_lock(&data->lock);
- cpumask_clear_cpu(cpu, to_cpumask(data->cpumask_bits));
+ cpumask_clear_cpu(cpu, &data->cpumask);
WARN_ON(data->refs == 0);
data->refs--;
refs = data->refs;
@@ -137,8 +137,10 @@ void generic_smp_call_function_interrupt(void)
*/
smp_wmb();
data->csd.flags &= ~CSD_FLAG_WAIT;
- }
- if (data->csd.flags & CSD_FLAG_ALLOC)
+ } else if (data->csd.flags & CSD_FLAG_LOCK) {
+ smp_wmb();
+ data->csd.flags &= ~CSD_FLAG_LOCK;
+ } else if (data->csd.flags & CSD_FLAG_ALLOC)
call_rcu(&data->rcu_head, rcu_free_call_data);
}
rcu_read_unlock();
@@ -302,6 +304,8 @@ void __smp_call_function_single(int cpu, struct call_single_data *data)
arch_send_call_function_ipi(*(maskp))
#endif

+static DEFINE_PER_CPU(struct call_function_data, cfd_data);
+
/**
* smp_call_function_many(): Run a function on a set of other CPUs.
* @mask: The set of cpus to run on (only runs on online subset).
@@ -323,14 +327,14 @@ void smp_call_function_many(const struct cpumask *mask,
{
struct call_function_data *data;
unsigned long flags;
- int cpu, next_cpu;
+ int cpu, next_cpu, me = smp_processor_id();

/* Can deadlock when called with interrupts disabled */
WARN_ON(irqs_disabled());

/* So, what's a CPU they want? Ignoring this one. */
cpu = cpumask_first_and(mask, cpu_online_mask);
- if (cpu == smp_processor_id())
+ if (cpu == me)
cpu = cpumask_next_and(cpu, mask, cpu_online_mask);
/* No online cpus? We're done. */
if (cpu >= nr_cpu_ids)
@@ -338,7 +342,7 @@ void smp_call_function_many(const struct cpumask *mask,

/* Do we have another CPU which isn't us? */
next_cpu = cpumask_next_and(cpu, mask, cpu_online_mask);
- if (next_cpu == smp_processor_id())
+ if (next_cpu == me)
next_cpu = cpumask_next_and(next_cpu, mask, cpu_online_mask);

/* Fastpath: do that cpu by itself. */
@@ -347,27 +351,24 @@ void smp_call_function_many(const struct cpumask *mask,
return;
}

- data = kmalloc(sizeof(*data) + cpumask_size(), GFP_ATOMIC);
- if (unlikely(!data)) {
- /* Slow path. */
- for_each_online_cpu(cpu) {
- if (cpu == smp_processor_id())
- continue;
- if (cpumask_test_cpu(cpu, mask))
- smp_call_function_single(cpu, func, info, wait);
- }
- return;
+ data = kmalloc(sizeof(*data), GFP_ATOMIC);
+ if (data)
+ data->csd.flags = CSD_FLAG_ALLOC;
+ else {
+ data = &per_cpu(cfd_data, me);
+ while (data->csd.flags & CSD_FLAG_LOCK)
+ cpu_relax();
+ data->csd.flags = CSD_FLAG_LOCK;
}

spin_lock_init(&data->lock);
- data->csd.flags = CSD_FLAG_ALLOC;
if (wait)
data->csd.flags |= CSD_FLAG_WAIT;
data->csd.func = func;
data->csd.info = info;
- cpumask_and(to_cpumask(data->cpumask_bits), mask, cpu_online_mask);
- cpumask_clear_cpu(smp_processor_id(), to_cpumask(data->cpumask_bits));
- data->refs = cpumask_weight(to_cpumask(data->cpumask_bits));
+ cpumask_and(&data->cpumask, mask, cpu_online_mask);
+ cpumask_clear_cpu(smp_processor_id(), &data->cpumask);
+ data->refs = cpumask_weight(&data->cpumask);

spin_lock_irqsave(&call_function_lock, flags);
list_add_tail_rcu(&data->csd.list, &call_function_queue);
@@ -379,7 +380,7 @@ void smp_call_function_many(const struct cpumask *mask,
smp_mb();

/* Send a message to all CPUs in the map */
- arch_send_call_function_ipi_mask(to_cpumask(data->cpumask_bits));
+ arch_send_call_function_ipi_mask(&data->cpumask);

/* optionally wait for the CPUs to complete */
if (wait)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.DeleteThis@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
KAMEZAWA Hiroyuki
External


Since: May 17, 2006
Posts: 404



PostPosted: Thu Feb 12, 2009 8:10 am    Post subject: Re: [patch] rt: res_counter fix, v2 [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Thu, 12 Feb 2009 12:28:54 +0100
Ingo Molnar <mingo.RemoveThis@elte.hu> wrote:

>
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu.RemoveThis@jp.fujitsu.com> wrote:
>
> > On Thu, 12 Feb 2009 11:21:13 +0100
> > Ingo Molnar <mingo.RemoveThis@elte.hu> wrote:
> >
> > >
> > > * Ingo Molnar <mingo.RemoveThis@elte.hu> wrote:
> > >
> > > > Frederic, could you try the patch below?
> > >
> > > Please try v2 below - it might even build Wink
> > >
> > > Ingo
> > >
> > > ------------------->
> > > Subject: rt: res_counter fix
> > > From: Ingo Molnar <mingo.RemoveThis@elte.hu>
> > > Date: Thu Feb 12 11:11:47 CET 2009
> > >
> > > Frederic Weisbecker reported this warning:
> > >
> > > [ 45.228562] BUG: sleeping function called from invalid context at kernel/rtmutex.c:683
> > > [ 45.228571] in_atomic(): 0, irqs_disabled(): 1, pid: 4290, name: ntpdate
> > > [ 45.228576] INFO: lockdep is turned off.
> > > [ 45.228580] irq event stamp: 0
> > > [ 45.228583] hardirqs last enabled at (0): [<(null)>] (null)
> > > [ 45.228589] hardirqs last disabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
> > > [ 45.228602] softirqs last enabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
> > > [ 45.228609] softirqs last disabled at (0): [<(null)>] (null)
> > > [ 45.228617] Pid: 4290, comm: ntpdate Tainted: G W 2.6.29-rc4-rt1-tip #1
> > > [ 45.228622] Call Trace:
> > > [ 45.228632] [<ffffffff8027dfb0>] ? print_irqtrace_events+0xd0/0xe0
> > > [ 45.228639] [<ffffffff8024cd73>] __might_sleep+0x113/0x130
> > > [ 45.228646] [<ffffffff8077c811>] rt_spin_lock+0xa1/0xb0
> > > [ 45.228653] [<ffffffff80296a3d>] res_counter_charge+0x5d/0x130
> > > [ 45.228660] [<ffffffff802fb67f>] __mem_cgroup_try_charge+0x7f/0x180
> > > [ 45.228667] [<ffffffff802fc407>] mem_cgroup_charge_common+0x57/0x90
> > > [ 45.228674] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
> > > [ 45.228680] [<ffffffff802fc49d>] mem_cgroup_newpage_charge+0x5d/0x60
> > > [ 45.228688] [<ffffffff802d94ce>] __do_fault+0x29e/0x4c0
> > > [ 45.228694] [<ffffffff8077c843>] ? rt_spin_unlock+0x23/0x80
> > > [ 45.228700] [<ffffffff802db8b5>] handle_mm_fault+0x205/0x890
> > > [ 45.228707] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
> > > [ 45.228714] [<ffffffff8023495e>] do_page_fault+0x11e/0x2a0
> > > [ 45.228720] [<ffffffff8077e5a5>] page_fault+0x25/0x30
> > > [ 45.228727] [<ffffffff8043e1ed>] ? __clear_user+0x3d/0x70
> > > [ 45.228733] [<ffffffff8043e1d1>] ? __clear_user+0x21/0x70
> > >
> > > The reason is the raw IRQ flag use of kernel/res_counter.c.
> > >
> > > The irq flags tricks there seem a bit pointless: it cannot
> > > protect the c->parent linkage because local_irq_save() is
> > > only per CPU.
> > >
> > > So replace it with _nort(). This code needs a second look.
> > >
> > I'm sorry for no knowledge about RT. Could you teach me what
> > local_irq_save_nort() does ?
> >
> > Hmm, how about just replacaing _irq() with preempt_disable()/enable() ?
> > xxx_nort() is better ?
> >
> > AFAIK, these will not be called from irq context. (Added Balbir to CC:)
>
> _nort() will just turn them into NOPs in essence.
>
> The question is, are these local IRQ flags manipulations really needed
> in this code, and if yes, why?
>
From my point of view, No. But original design of res_coutner used irqsave/restore().
The author may think of use this function from irq-context.

But now, it seems it's better to remove them, Hmm.

Thanks,
-Kame

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.RemoveThis@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Peter Zijlstra
External


Since: Jun 06, 2007
Posts: 205



PostPosted: Thu Feb 12, 2009 8:10 am    Post subject: Re: [patch] generic-ipi: remove kmalloc, cleanup [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Thu, 2009-02-12 at 13:09 +0100, Peter Zijlstra wrote:


> @@ -137,8 +137,10 @@ void generic_smp_call_function_interrupt(void)
> */
> smp_wmb();
> data->csd.flags &= ~CSD_FLAG_WAIT;
> - }
> - if (data->csd.flags & CSD_FLAG_ALLOC)
> + } else if (data->csd.flags & CSD_FLAG_LOCK) {
> + smp_wmb();
> + data->csd.flags &= ~CSD_FLAG_LOCK;
> + } else if (data->csd.flags & CSD_FLAG_ALLOC)
> call_rcu(&data->rcu_head, rcu_free_call_data);
> }
> rcu_read_unlock();

Hmm, I think this bit ought to go in rcu_free_call_data(), otherwise we
can have that same race again..

I'd better put a comment in on why we use RCU here.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo RemoveThis @vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Frederic Weisbecker
External


Since: Jan 17, 2009
Posts: 267



PostPosted: Thu Feb 12, 2009 10:10 am    Post subject: Re: [patch] rt: res_counter fix, v2 [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Thu, Feb 12, 2009 at 11:21:13AM +0100, Ingo Molnar wrote:
>
> * Ingo Molnar <mingo RemoveThis @elte.hu> wrote:
>
> > Frederic, could you try the patch below?
>
> Please try v2 below - it might even build Wink
>
> Ingo


Ok, I tested it through 3 bootups and it did not triggered.
But it's absolutely not a guarantee, as I said, I tried to reproduce it
with several bootups yesterday and I couln't see it again.

But by reading the warning and your patch, yes it is supposed to solve it Smile

BTW, a small thing:

> [ 45.228589] hardirqs last disabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
> [ 45.228602] softirqs last enabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500


The reason for which I wanted to send an irqsoff trace is that the above lines are false.

addr2line -e vmlinux ffffffff8025449d
/home/me/linux/rt/linux-2.6.29-rc4/kernel/fork.c:1107 (before your patch)

Which is this area in copy_process:

#ifdef CONFIG_NUMA
p->mempolicy = mpol_dup(p->mempolicy);
if (IS_ERR(p->mempolicy)) {
retval = PTR_ERR(p->mempolicy);
p->mempolicy = NULL;
goto bad_fork_cleanup_cgroup;
}
mpol_fix_fork_child_flag(p); // <-- 1107
#endif

Nothing here disables irq.
That's why I wanted to provide an irqsoff trace. But, heh I only had to read
res_counter_charge() Smile

Anyway, there is a problem with these hardirqs/softirqs last disabled...


> [ 45.228609] softirqs last disabled at (0): [<(null)>] (null)
> [ 45.228617] Pid: 4290, comm: ntpdate Tainted: G W 2.6.29-rc4-rt1-tip #1
> [ 45.228622] Call Trace:
> [ 45.228632] [<ffffffff8027dfb0>] ? print_irqtrace_events+0xd0/0xe0
> [ 45.228639] [<ffffffff8024cd73>] __might_sleep+0x113/0x130
> [ 45.228646] [<ffffffff8077c811>] rt_spin_lock+0xa1/0xb0
> [ 45.228653] [<ffffffff80296a3d>] res_counter_charge+0x5d/0x130
> [ 45.228660] [<ffffffff802fb67f>] __mem_cgroup_try_charge+0x7f/0x180
> [ 45.228667] [<ffffffff802fc407>] mem_cgroup_charge_common+0x57/0x90
> [ 45.228674] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
> [ 45.228680] [<ffffffff802fc49d>] mem_cgroup_newpage_charge+0x5d/0x60
> [ 45.228688] [<ffffffff802d94ce>] __do_fault+0x29e/0x4c0
> [ 45.228694] [<ffffffff8077c843>] ? rt_spin_unlock+0x23/0x80
> [ 45.228700] [<ffffffff802db8b5>] handle_mm_fault+0x205/0x890
> [ 45.228707] [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
> [ 45.228714] [<ffffffff8023495e>] do_page_fault+0x11e/0x2a0
> [ 45.228720] [<ffffffff8077e5a5>] page_fault+0x25/0x30
> [ 45.228727] [<ffffffff8043e1ed>] ? __clear_user+0x3d/0x70
> [ 45.228733] [<ffffffff8043e1d1>] ? __clear_user+0x21/0x70
>
> The reason is the raw IRQ flag use of kernel/res_counter.c.
>
> The irq flags tricks there seem a bit pointless: it cannot
> protect the c->parent linkage because local_irq_save() is
> only per CPU.
>
> So replace it with _nort(). This code needs a second look.
>
> Reported-by: Frederic Weisbecker <fweisbec RemoveThis @gmail.com>
> Signed-off-by: Ingo Molnar <mingo RemoveThis @elte.hu>
> ---
> kernel/res_counter.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> Index: tip/kernel/res_counter.c
> ===================================================================
> --- tip.orig/kernel/res_counter.c
> +++ tip/kernel/res_counter.c
> @@ -14,6 +14,7 @@
> #include <linux/res_counter.h>
> #include <linux/uaccess.h>
> #include <linux/mm.h>
> +#include <linux/interrupt.h>
>
> void res_counter_init(struct res_counter *counter, struct res_counter *parent)
> {
> @@ -43,7 +44,7 @@ int res_counter_charge(struct res_counte
> struct res_counter *c, *u;
>
> *limit_fail_at = NULL;
> - local_irq_save(flags);
> + local_irq_save_nort(flags);
> for (c = counter; c != NULL; c = c->parent) {
> spin_lock(&c->lock);
> ret = res_counter_charge_locked(c, val);
> @@ -62,7 +63,7 @@ undo:
> spin_unlock(&u->lock);
> }
> done:
> - local_irq_restore(flags);
> + local_irq_restore_nort(flags);
> return ret;
> }
>
> @@ -79,13 +80,13 @@ void res_counter_uncharge(struct res_cou
> unsigned long flags;
> struct res_counter *c;
>
> - local_irq_save(flags);
> + local_irq_save_nort(flags);
> for (c = counter; c != NULL; c = c->parent) {
> spin_lock(&c->lock);
> res_counter_uncharge_locked(c, val);
> spin_unlock(&c->lock);
> }
> - local_irq_restore(flags);
> + local_irq_restore_nort(flags);
> }
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo RemoveThis @vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Frederic Weisbecker
External


Since: Jan 17, 2009
Posts: 267



PostPosted: Thu Feb 12, 2009 10:10 am    Post subject: Re: [patch] generic-ipi: remove kmalloc, cleanup [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Thu, Feb 12, 2009 at 11:07:56AM +0100, Ingo Molnar wrote:
>
> * Peter Zijlstra <peterz.DeleteThis@infradead.org> wrote:
>
> > On Thu, 2009-02-12 at 09:19 +0100, Ingo Molnar wrote:
> > > * Ingo Molnar <mingo.DeleteThis@elte.hu> wrote:
> > >
> > > > hm, that's a complex one - we do kfree() from IPI context, [...]
> > >
> > > The patch below might do the trick - it offloads this to a softirq.
> > > Not tested yet.
> >
> > The simple fix is something like:
> >
> > ---
> > kernel/smp.c | 8 ++++++++
> > 1 files changed, 8 insertions(+), 0 deletions(-)
>
> ok, i made it unconditional (not just a PREEMPT_RT hac) and did the
> cleanup below on top of it.
>
> I dont think repeat, queued IPIs are all that interesting from a
> performance point of view. If they are, it will all be clearly
> bisectable.
>
> Ingo


Would you prefer I test this one or wait for everybody agree a final version?


> --------------->
> Subject: generic-ipi: remove kmalloc, cleanup
> From: Ingo Molnar <mingo.DeleteThis@elte.hu>
>
> Now that we dont use the kmalloc() sequence anymore, remove
> CSD_FLAG_ALLOC and all its dependencies.
>
> Signed-off-by: Ingo Molnar <mingo.DeleteThis@elte.hu>
> ---
> kernel/smp.c | 86 +++++++++++------------------------------------------------
> 1 file changed, 17 insertions(+), 69 deletions(-)
>
> Index: tip/kernel/smp.c
> ===================================================================
> --- tip.orig/kernel/smp.c
> +++ tip/kernel/smp.c
> @@ -17,8 +17,7 @@ __cacheline_aligned_in_smp DEFINE_RAW_SP
>
> enum {
> CSD_FLAG_WAIT = 0x01,
> - CSD_FLAG_ALLOC = 0x02,
> - CSD_FLAG_LOCK = 0x04,
> + CSD_FLAG_LOCK = 0x02,
> };
>
> struct call_function_data {
> @@ -85,15 +84,6 @@ static void generic_exec_single(int cpu,
> csd_flag_wait(data);
> }
>
> -static void rcu_free_call_data(struct rcu_head *head)
> -{
> - struct call_function_data *data;
> -
> - data = container_of(head, struct call_function_data, rcu_head);
> -
> - kfree(data);
> -}
> -
> /*
> * Invoked by arch to handle an IPI for call function. Must be called with
> * interrupts disabled.
> @@ -138,8 +128,6 @@ void generic_smp_call_function_interrupt
> smp_wmb();
> data->csd.flags &= ~CSD_FLAG_WAIT;
> }
> - if (data->csd.flags & CSD_FLAG_ALLOC)
> - call_rcu(&data->rcu_head, rcu_free_call_data);
> }
> rcu_read_unlock();
>
> @@ -190,8 +178,7 @@ void generic_smp_call_function_single_in
> } else if (data_flags & CSD_FLAG_LOCK) {
> smp_wmb();
> data->flags &= ~CSD_FLAG_LOCK;
> - } else if (data_flags & CSD_FLAG_ALLOC)
> - kfree(data);
> + }
> }
> /*
> * See comment on outer loop
> @@ -236,13 +223,11 @@ int smp_call_function_single(int cpu, vo
> /*
> * We are calling a function on a single CPU
> * and we are not going to wait for it to finish.
> - * We first try to allocate the data, but if we
> - * fail, we fall back to use a per cpu data to pass
> - * the information to that CPU. Since all callers
> - * of this code will use the same data, we must
> - * synchronize the callers to prevent a new caller
> - * from corrupting the data before the callee
> - * can access it.
> + * We use a per cpu data to pass the information to
> + * that CPU. Since all callers of this code will use
> + * the same data, we must synchronize the callers to
> + * prevent a new caller from corrupting the data before
> + * the callee can access it.
> *
> * The CSD_FLAG_LOCK is used to let us know when
> * the IPI handler is done with the data.
> @@ -252,15 +237,10 @@ int smp_call_function_single(int cpu, vo
> * will make sure the callee is done with the
> * data before a new caller will use it.
> */
> - data = NULL;
> - if (data)
> - data->flags = CSD_FLAG_ALLOC;
> - else {
> - data = &per_cpu(csd_data, me);
> - while (data->flags & CSD_FLAG_LOCK)
> - cpu_relax();
> - data->flags = CSD_FLAG_LOCK;
> - }
> + data = &per_cpu(csd_data, me);
> + while (data->flags & CSD_FLAG_LOCK)
> + cpu_relax();
> + data->flags = CSD_FLAG_LOCK;
> } else {
> data = &d;
> data->flags = CSD_FLAG_WAIT;
> @@ -321,8 +301,6 @@ void smp_call_function_many(const struct
> void (*func)(void *), void *info,
> bool wait)
> {
> - struct call_function_data *data;
> - unsigned long flags;
> int cpu, next_cpu;
>
> /* Can deadlock when called with interrupts disabled */
> @@ -347,43 +325,13 @@ void smp_call_function_many(const struct
> return;
> }
>
> - data = NULL;
> - if (unlikely(!data)) {
> - /* Slow path. */
> - for_each_online_cpu(cpu) {
> - if (cpu == smp_processor_id())
> - continue;
> - if (cpumask_test_cpu(cpu, mask))
> - smp_call_function_single(cpu, func, info, wait);
> - }
> - return;
> + /* Slow path. */
> + for_each_online_cpu(cpu) {
> + if (cpu == smp_processor_id())
> + continue;
> + if (cpumask_test_cpu(cpu, mask))
> + smp_call_function_single(cpu, func, info, wait);
> }
> -
> - spin_lock_init(&data->lock);
> - data->csd.flags = CSD_FLAG_ALLOC;
> - if (wait)
> - data->csd.flags |= CSD_FLAG_WAIT;
> - data->csd.func = func;
> - data->csd.info = info;
> - cpumask_and(to_cpumask(data->cpumask_bits), mask, cpu_online_mask);
> - cpumask_clear_cpu(smp_processor_id(), to_cpumask(data->cpumask_bits));
> - data->refs = cpumask_weight(to_cpumask(data->cpumask_bits));
> -
> - spin_lock_irqsave(&call_function_lock, flags);
> - list_add_tail_rcu(&data->csd.list, &call_function_queue);
> - spin_unlock_irqrestore(&call_function_lock, flags);
> -
> - /*
> - * Make the list addition visible before sending the ipi.
> - */
> - smp_mb();
> -
> - /* Send a message to all CPUs in the map */
> - arch_send_call_function_ipi_mask(to_cpumask(data->cpumask_bits));
> -
> - /* optionally wait for the CPUs to complete */
> - if (wait)
> - csd_flag_wait(&data->csd);
> }
> EXPORT_SYMBOL(smp_call_function_many);
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.DeleteThis@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Peter Zijlstra
External


Since: Dec 16, 2006
Posts: 781



PostPosted: Thu Feb 12, 2009 11:10 am    Post subject: Re: [patch] rt: res_counter fix, v2 [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On Thu, 2009-02-12 at 15:28 +0100, Frederic Weisbecker wrote:
> > [ 45.228589] hardirqs last disabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
> > [ 45.228602] softirqs last enabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
>
>
> The reason for which I wanted to send an irqsoff trace is that the above lines are false.

copy_process() has:

#endif
p->hardirq_enable_ip = 0;
p->hardirq_enable_event = 0;
p->hardirq_disable_ip = _THIS_IP_;
p->hardirq_disable_event = 0;
p->softirqs_enabled = 1;
p->softirq_enable_ip = _THIS_IP_;
p->softirq_enable_event = 0;
p->softirq_disable_ip = 0;
p->softirq_disable_event = 0;
p->hardirq_context = 0;
p->softirq_context = 0;
#endif

the sequence count of 0 basically tells you it hasn't been set yet.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo.DeleteThis@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Back to top
Display posts from previous:   
Post new topic   General Reply to Topic (not reply to a specific post)    Forums Home -> Kernel All times are: Eastern Time (US & Canada) (change)
Goto page 1, 2, 3 ... 9, 10, 11
Page 1 of 11

 
You can post new topics in this forum
You can reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum