Arjan van de Ven External

Since: May 15, 2006 Posts: 901
|
Posted: Tue Sep 12, 2006 9:40 am Post subject: i386 PDA patches use of %gs Archived from groups: linux>kernel
|
|
Hi,
Userspace uses %gs for its per-thread data (and in modern Linux
versions that means "all the time"; errno is there, for example).
On x86-64 this is the reason that the kernel uses the OTHER segment
register; so for the PDA patches this would mean using %fs and not %gs.
The advantage of this is very simple: %fs will be 0 for userspace most
of the time. Putting 0 in a segment register is cheap for the cpu,
putting anything else in is quite expensive (a LOT of security checks
need to happen). As such I would MUCH rather see that the i386 PDA
patches use %fs and not %gs...
Jeremy, is there a reason you're specifically using %gs and not %fs? If
not, would you mind a switch to using %fs instead?
Greetings,
Arjan van de Ven
--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo DeleteThis @vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/ |
|
 |
Jeremy Fitzhardinge External

Since: May 30, 2006 Posts: 1261
|
Posted: Tue Sep 12, 2006 9:50 am Post subject: Re: i386 PDA patches use of %gs
|
|
Arjan van de Ven wrote:
> Jeremy, is there a reason you're specifically using %gs and not %fs? If
> not, would you mind a switch to using %fs instead?
>
The main reason for using %gs was to take advantage of gcc's TLS
support. I intend to measure the cost of gs vs fs, and if there's a
significant difference I'll switch.
J
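For readers unfamiliar with the compiler TLS support mentioned here, the following is a minimal userspace sketch, not the kernel patch itself. It assumes GCC (or Clang) and the `__thread` storage class; the names `hits` and `bump_hits` are hypothetical. On i386 Linux the compiler addresses such variables relative to %gs, which is why reusing that mechanism in the kernel would tie the PDA to %gs.

```c
#include <assert.h>
#include <limits.h>

/* One private instance of this variable exists per thread
 * (GCC/Clang extension; C11 spells it _Thread_local). */
static __thread int hits;

/* Increment the calling thread's private counter and return the new value.
 * The compiler emits a segment-relative access for `hits`. */
int bump_hits(void)
{
	assert(hits < INT_MAX);	/* guard against signed overflow */
	return ++hits;
}
```

Calling `bump_hits()` twice from the same thread returns 1 and then 2; another thread would start again from 1 because it has its own copy.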
|
 |
Arjan van de Ven External

Since: May 15, 2006 Posts: 901
|
Posted: Tue Sep 12, 2006 10:00 am Post subject: Re: i386 PDA patches use of %gs
|
|
On Tue, 2006-09-12 at 00:48 -0700, Jeremy Fitzhardinge wrote:
> Arjan van de Ven wrote:
> > Jeremy, is there a reason you're specifically using %gs and not %fs? If
> > not, would you mind a switch to using %fs instead?
> >
>
> The main reason for using %gs was to take advantage of gcc's TLS
> support. I intend to measure the cost of gs vs fs, and if there's a
> significant difference I'll switch.
gcc can be fixed if needed. I don't see the kernel switching to use that
any time soon though...
|
 |
Jeremy Fitzhardinge External

Since: May 30, 2006 Posts: 1261
|
Posted: Tue Sep 12, 2006 10:40 am Post subject: Re: i386 PDA patches use of %gs
|
|
Arjan van de Ven wrote:
> gcc can be fixed if needed. I don't see the kernel switching to use that
> any time soon though...
I have a preliminary patch to implement per_cpu() in terms of __thread.
Hm, my initial tests comparing reloading a NULL selector vs a real
selector show absolutely no measurable difference, on either a modern
Core Duo or an old P4... Admittedly this is with an artificial
usermode test program, but I'd expect to see *some* difference if
there's a difference.
J
--
/* gcc -o time-segops time-segops.c -O2 -Wall -lrt -fomit-frame-pointer -funroll-loops */
/* i386 only: the inline asm pushes and pops segment registers with 32-bit operands. */
#include <stdio.h>
#include <time.h>

#define COUNT 10000000

/* Serialize the pipeline with cpuid so that iterations do not overlap. */
static inline void sync(void)
{
	int a, b, c, d;

	asm volatile("cpuid"
		     : "=a" (a), "=b" (b), "=c" (c), "=d" (d)
		     : "0" (0), "2" (0)
		     : "memory");
}

/* Baseline: cost of the loop and the serializing cpuid alone. */
static void test_none(void)
{
	int i;

	for (i = 0; i < COUNT; i++) {
		sync();
	}
}

/* Save %fs, load the %ds selector into it, then restore it. */
static void test_fs(void)
{
	int i, ds;

	asm volatile("mov %%ds,%0" : "=r" (ds));
	for (i = 0; i < COUNT; i++) {
		asm volatile("push %%fs; mov %0, %%fs; popl %%fs"
			     : : "r" (ds));
		sync();
	}
}

/* Same as test_fs(), but for %gs. */
static void test_gs(void)
{
	int i, ds;

	asm volatile("mov %%ds,%0" : "=r" (ds));
	for (i = 0; i < COUNT; i++) {
		asm volatile("push %%gs; mov %0, %%gs; popl %%gs"
			     : : "r" (ds));
		sync();
	}
}

typedef void (*test_t)(void);

/* NULL-terminated list of tests, run in order. */
static test_t tests[] = {
	test_none,
	test_fs,
	test_gs,
	NULL,
};

int main(void)
{
	int i;
	int ds, fs, gs;

	/* Report the selectors the process starts with. */
	asm volatile("mov %%ds, %0; "
		     "mov %%fs, %1; "
		     "mov %%gs, %2"
		     : "=r" (ds), "=r" (fs), "=r" (gs) : : "memory");
	printf("fs=%x gs=%x\n", fs, gs);

	for (i = 0; tests[i]; i++) {
		struct timespec start, end;
		unsigned long long delta;

		clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start);
		(*tests[i])();
		clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end);

		/* Elapsed CPU time in nanoseconds, averaged per iteration. */
		delta = (end.tv_sec * 1000000000ull + end.tv_nsec) -
			(start.tv_sec * 1000000000ull + start.tv_nsec);
		delta /= COUNT;
		printf("%lluns/iteration\n", delta);
	}
	return 0;
}
|
 |
Ingo Molnar External

Since: May 15, 2006 Posts: 3111
|
Posted: Wed Sep 13, 2006 12:10 pm Post subject: Re: i386 PDA patches use of %gs
|
|
* Jeremy Fitzhardinge <jeremy.TakeThisOut@goop.org> wrote:
> [...] The basic inner loop is:
>
> push %segreg
> mov %selectorreg, %segreg
> add $1,%segreg:offset # use the segment register
> pop %segreg
well, the most important thing i believe you didn't test: the effect of
mixing two descriptors on the _same_ selector: one %gs selector value
loaded and used by glibc, and another %gs selector value loaded and used
by the kernel, intermixed. It's the mixing that causes the descriptor
cache reload. (unless i missed some detail about your testcase)
Ingo
|
 |
Jeremy Fitzhardinge External

Since: May 30, 2006 Posts: 1261
|
Posted: Wed Sep 13, 2006 6:20 pm Post subject: Re: i386 PDA patches use of %gs
|
|
Ingo Molnar wrote:
> well, the most important thing i believe you didn't test: the effect of
> mixing two descriptors on the _same_ selector: one %gs selector value
> loaded and used by glibc, and another %gs selector value loaded and used
> by the kernel, intermixed. It's the mixing that causes the descriptor
> cache reload. (unless i missed some detail about your testcase)
But it doesn't mix different descriptors on the same selector; the GDT
is initialized when the CPU is brought up, and is unchanged from then
on. The PDA descriptor is GDT entry 27 and the userspace TLS entries
are 6-8, so in the typical case %gs will alternate between 0x33 and 0xd8
as it enters and leaves the kernel.
My test program does the same thing, except using GDT entries 6 and 7
(selectors 0x33 and 0x3b).
J
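The selector values quoted above follow from the x86 selector layout: descriptor index in bits 15..3, table indicator in bit 2, requested privilege level (RPL) in bits 1..0. A minimal sketch of that arithmetic follows. The helper name `make_selector` is hypothetical, and the RPL values (3 for the userspace TLS entries, 0 for the kernel PDA entry) are inferred from the numbers in the post rather than stated in it.

```c
#include <assert.h>

/* Build an x86 segment selector from its three fields:
 *   index   - descriptor table index (bits 15..3)
 *   use_ldt - table indicator, 0 = GDT, 1 = LDT (bit 2)
 *   rpl     - requested privilege level (bits 1..0)
 */
unsigned make_selector(unsigned index, unsigned use_ldt, unsigned rpl)
{
	assert(index < 8192 && use_ldt <= 1 && rpl <= 3);
	return (index << 3) | (use_ldt << 2) | rpl;
}
```

With these assumptions, `make_selector(6, 0, 3)` gives 0x33, `make_selector(7, 0, 3)` gives 0x3b, and `make_selector(27, 0, 0)` gives 0xd8, matching the values in the post.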
|
 |
Andi Kleen External

Since: Jul 07, 2006 Posts: 1925
|
Posted: Wed Nov 15, 2006 12:40 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
On Wednesday 15 November 2006 12:27, Eric Dumazet wrote:
> Seeing %gs prefixes used now by i386 port, I recalled seeing strange oprofile
> results on Opteron machines.
>
> I really think %gs prefixes can be expensive in some (most ?) cases, even if
> the Intel/AMD docs say they are free.
They aren't free, just very cheap.
>
> With the attached patch, I got 12.212 s, and a kernel text size reduction of
> 3400 bytes.
Are the benchmark numbers stable? i.e. if you repeat them multiple times
with reboots do you still get the same difference?
-Andi
|
 |
Ingo Molnar External

Since: May 15, 2006 Posts: 3111
|
Posted: Wed Nov 15, 2006 6:30 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
* Andi Kleen <ak.TakeThisOut@suse.de> wrote:
> On Wednesday 15 November 2006 12:27, Eric Dumazet wrote:
> > Seeing %gs prefixes used now by i386 port, I recalled seeing strange
> > oprofile results on Opteron machines.
> >
> > I really think %gs prefixes can be expensive in some (most ?) cases,
> > even if the Intel/AMD docs say they are free.
>
> They aren't free, just very cheap.
Eric's test shows a 5% slowdown. That's far from cheap.
Ingo
|
 |
Andi Kleen External

Since: Jul 07, 2006 Posts: 1925
|
Posted: Wed Nov 15, 2006 6:30 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
On Wednesday 15 November 2006 18:20, Ingo Molnar wrote:
>
> * Andi Kleen <ak.TakeThisOut@suse.de> wrote:
>
> > On Wednesday 15 November 2006 12:27, Eric Dumazet wrote:
> > > Seeing %gs prefixes used now by i386 port, I recalled seeing strange
> > > oprofile results on Opteron machines.
> > >
> > > I really think %gs prefixes can be expensive in some (most ?) cases,
> > > even if the Intel/AMD docs say they are free.
> >
> > They aren't free, just very cheap.
>
> Eric's test shows a 5% slowdown. That's far from cheap.
I have my doubts about the accuracy of his test results. That is why I asked
him to double check.
-Andi
|
 |
Jeremy Fitzhardinge External

Since: May 30, 2006 Posts: 1261
|
Posted: Wed Nov 15, 2006 6:40 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
Ingo Molnar wrote:
> Eric's test shows a 5% slowdown. That's far from cheap.
>
It seems like an absurdly large difference. PDA references aren't all
that common in the kernel; for the %gs prefix on PDA accesses to be
causing a 5% overall difference in a test like this means that the
prefixes would have to be costing hundreds or thousands of cycles, which
seems absurd. Particularly since Eric's patch doesn't touch head.S, so
the %gs save/restore is still being executed.
Are we sure this isn't a cache layout issue? Eric, did you try evicting
your executable from pagecache between runs to see if you get variation
depending on what physical pages it gets put into? (Making several
copies of the executable should have the same effect.)
J
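The back-of-the-envelope argument above can be written out as a small calculation. This is only a sketch: the function name is hypothetical, and the inputs in the usage note are made-up illustrative numbers, not measurements from Eric's benchmark.

```c
#include <assert.h>

/* Extra cycles that each prefixed access would have to cost in order to
 * explain an overall slowdown of `slowdown` (0.05 means 5%) in a run that
 * takes `total_cycles` cycles and performs `accesses` prefixed accesses. */
double implied_cycles_per_access(double slowdown, double total_cycles,
				 double accesses)
{
	assert(accesses > 0.0);
	return slowdown * total_cycles / accesses;
}
```

For example, under the purely hypothetical assumption of a 2e10-cycle run containing 1e6 PDA accesses, a 5% slowdown would imply about 1000 extra cycles per access, which is the kind of figure the post calls absurd.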
|
 |
Ingo Molnar External

Since: May 15, 2006 Posts: 3111
|
Posted: Wed Nov 15, 2006 6:40 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
* Jeremy Fitzhardinge <jeremy.RemoveThis@goop.org> wrote:
> > Eric's test shows a 5% slowdown. That's far from cheap.
>
> It seems like an absurdly large difference. PDA references aren't all
> that common in the kernel; for the %gs prefix on PDA accesses to be
> causing a 5% overall difference in a test like this means that the
> prefixes would have to be costing hundreds or thousands of cycles,
> which seems absurd. Particularly since Eric's patch doesn't touch
> head.S, so the %gs save/restore is still being executed.
i said this before: using segmentation tricks these days is /insane/.
Segmentation is not for free, and it's not going to be cheap in the
future. In fact, chances are that it will be /more/ expensive in the
future, because sane OSs just make no use of them besides the trivial
"they don't even exist" uses.
so /at a minimum/, as i suggested it before, the kernel's segment use
should not overlap that of glibc's. I.e. the kernel should use %fs, not
%gs.
Ingo
|
 |
Jeremy Fitzhardinge External

Since: May 30, 2006 Posts: 1261
|
Posted: Wed Nov 15, 2006 7:00 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
Eric Dumazet wrote:
> I wish Jeremy would give us patches for UP machines so that %gs can be left
> untouched in entry.S (syscall entry/exit). A lot of ia32 machines are still
> using one CPU.
>
Unfortunately that would add cruft in a number of places. At the
moment, context switch, ptrace and vm86 all assume entry.S has saved %gs
into pt_regs, so they can treat it like any other register. If this
were conditional, it would require multiple places to add #ifndef
CONFIG_SMP code, which is not something I'd like to do without a good
reason.
J
|
 |
Ingo Molnar External

Since: May 15, 2006 Posts: 3111
|
Posted: Wed Nov 15, 2006 7:00 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
* Eric Dumazet <dada1 RemoveThis @cosmosbay.com> wrote:
> Machine boots but freezes when init starts. Any idea?
probably caused by this:
> +# define GET_CPU_NUM(reg)
> #define FIXUP_ESPFIX_STACK \
> /* since we are on a wrong stack, we cant make it a C code */ \
> - movl %gs:PDA_cpu, %ebx; \
> + GET_CPU_NUM(%ebx) \
> PER_CPU(cpu_gdt_descr, %ebx); \
> movl GDS_address(%ebx), %ebx; \
%ebx very definitely wants to have the current CPU number loaded. Pick
it up from the task struct.
Ingo
|
 |
Eric Dumazet External

Since: May 15, 2006 Posts: 230
|
Posted: Wed Nov 15, 2006 7:00 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
On Wednesday 15 November 2006 18:49, Ingo Molnar wrote:
> * Eric Dumazet <dada1.TakeThisOut@cosmosbay.com> wrote:
> > Machine boots but freezes when init starts. Any idea?
>
> probably caused by this:
> > +# define GET_CPU_NUM(reg)
> >
> > #define FIXUP_ESPFIX_STACK \
> > /* since we are on a wrong stack, we cant make it a C code */ \
> > - movl %gs:PDA_cpu, %ebx; \
> > + GET_CPU_NUM(%ebx) \
> > PER_CPU(cpu_gdt_descr, %ebx); \
> > movl GDS_address(%ebx), %ebx; \
>
> %ebx very definitely wants to have the current CPU number loaded. Pick
> it up from the task struct.
Hum... Are you sure?
For UP we have this PER_CPU definition:
#define PER_CPU(var, cpu) \
	movl $per_cpu__/**/var, cpu;
You can see 'cpu' is a pure output, not an input value.
So I basically deleted the first instruction of this sequence:
	movl %gs:PDA_cpu, %ebx
	movl $per_cpu__cpu_gdt_descr, %ebx;
Did I miss something?
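The point about 'cpu' being a pure output on UP can be illustrated with a small C model. This is a sketch only: `NR_CPUS`, `percpu_area`, `per_cpu_smp` and `per_cpu_up` are hypothetical names, and the SMP variant is an assumed illustration of a lookup indexed by CPU number, not the kernel's actual macro.

```c
#include <assert.h>

#define NR_CPUS 4	/* hypothetical CPU count for the model */

/* One copy of the per-CPU data for each modelled CPU. */
static char percpu_area[NR_CPUS][64];

/* SMP-style lookup (assumed shape): the CPU number is an input that
 * selects which copy of the data is returned. */
char *per_cpu_smp(unsigned cpu)
{
	assert(cpu < NR_CPUS);
	return percpu_area[cpu];
}

/* UP-style lookup: there is only one copy, so the argument is ignored,
 * mirroring how the UP PER_CPU macro simply overwrites its register. */
char *per_cpu_up(unsigned cpu)
{
	(void)cpu;
	return percpu_area[0];
}
```

In this model `per_cpu_up(0)` and `per_cpu_up(3)` return the same address, so whatever was loaded into the register beforehand cannot matter, while `per_cpu_smp(0)` and `per_cpu_smp(1)` differ.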
|
 |
Jeremy Fitzhardinge External

Since: May 30, 2006 Posts: 1261
|
Posted: Wed Nov 15, 2006 7:00 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
Ingo Molnar wrote:
> i said this before: using segmentation tricks these days is /insane/.
> Segmentation is not for free, and it's not going to be cheap in the
> future. In fact, chances are that it will be /more/ expensive in the
> future, because sane OSs just make no use of them besides the trivial
> "they don't even exist" uses.
>
Many, many systems use %fs/%gs to implement some kind of thread-local
storage, and such usage is becoming more common; the PDA's use of it in
the kernel is no different. I would agree that using all the obscure
corners of segmentation is just asking for trouble, but using %gs as an
address offset seems like something that's going to be efficient on x86
32/64 processors indefinitely.
> so /at a minimum/, as i suggested it before, the kernel's segment use
> should not overlap that of glibc's. I.e. the kernel should use %fs, not
> %gs.
Last time you raised this I did a pretty comprehensive set of tests
which showed there was flat out zero difference between using %fs and
%gs. There doesn't seem to be anything to the theory that reloading a
null segment selector is in any way cheaper than loading a real
selector. Did you find a problem in my methodology?
J
|
 |
Eric Dumazet External

Since: May 15, 2006 Posts: 230
|
Posted: Wed Nov 15, 2006 7:10 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
On Wednesday 15 November 2006 18:59, Jeremy Fitzhardinge wrote:
> Ingo Molnar wrote:
> > i said this before: using segmentation tricks these days is /insane/.
> > Segmentation is not for free, and it's not going to be cheap in the
> > future. In fact, chances are that it will be /more/ expensive in the
> > future, because sane OSs just make no use of them besides the trivial
> > "they don't even exist" uses.
>
> Many, many systems use %fs/%gs to implement some kind of thread-local
> storage, and such usage is becoming more common; the PDA's use of it in
> the kernel is no different. I would agree that using all the obscure
> corners of segmentation is just asking for trouble, but using %gs as an
> address offset seems like something that's going to be efficient on x86
> 32/64 processors indefinitely.
>
> > so /at a minimum/, as i suggested it before, the kernel's segment use
> > should not overlap that of glibc's. I.e. the kernel should use %fs, not
> > %gs.
>
> Last time you raised this I did a pretty comprehensive set of tests
> which showed there was flat out zero difference between using %fs and
> %gs. There doesn't seem to be anything to the theory that reloading a
> null segment selector is in any way cheaper than loading a real
> selector. Did you find a problem in my methodology?
I have the feeling (most probably wrong, but I prefer to speak up rather than
keep this to myself) that the cost of a segment load is delayed until the first
use of the segment selector. Sort of a lazy reload...
I had this crazy idea while looking at oprofile numbers.
|
 |
Ingo Molnar External

Since: May 15, 2006 Posts: 3111
|
Posted: Wed Nov 15, 2006 7:10 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
* Eric Dumazet <dada1.TakeThisOut@cosmosbay.com> wrote:
> > > + GET_CPU_NUM(%ebx) \
> > > PER_CPU(cpu_gdt_descr, %ebx); \
> > > movl GDS_address(%ebx), %ebx; \
> >
> > %ebx very definitely wants to have the current CPU number loaded. Pick
> > it up from the task struct.
>
> Hum.... Are you sure ?
>
> For UP we have this PER_CPU definition :
>
> #define PER_CPU(var, cpu) \
> movl $per_cpu__/**/var, cpu;
hm, you are right. No quick ideas then.
Ingo
|
 |
Arjan van de Ven External

Since: May 15, 2006 Posts: 901
|
Posted: Wed Nov 15, 2006 7:10 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
On Wed, 2006-11-15 at 09:28 -0800, Jeremy Fitzhardinge wrote:
> Ingo Molnar wrote:
> > Eric's test shows a 5% slowdown. That's far from cheap.
> >
>
> It seems like an absurdly large difference. PDA references aren't all
> that common in the kernel; for the %gs prefix on PDA accesses to be
> causing a 5% overall difference in a test like this means that the
> prefixes would have to be costing hundreds or thousands of cycles, which
> seems absurd. Particularly since Eric's patch doesn't touch head.S, so
> the %gs save/restore is still being executed.
segment register accesses really are not cheap.
Also really it'll be better to use the register userspace is not using,
but we had that discussion before; could you remind me why you picked
%gs in the first place?
--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org
|
 |
Jeremy Fitzhardinge External

Since: May 30, 2006 Posts: 1261
|
Posted: Wed Nov 15, 2006 7:30 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
Arjan van de Ven wrote:
> segment register accesses really are not cheap.
> Also really it'll be better to use the register userspace is not using,
> but we had that discussion before; could you remind me why you picked
> %gs in the first place?
>
To leave open the possibility of using the compiler's TLS support in the
kernel for percpu. I also measured the cost of reloading %gs vs %fs,
and found no difference between reloading a null selector vs a non-null
selector.
J
|
 |
Jeremy Fitzhardinge External

Since: May 30, 2006 Posts: 1261
|
Posted: Wed Nov 15, 2006 7:30 pm Post subject: Re: [PATCH] i386-pda UP optimization
|
|
Eric Dumazet wrote:
> I have the feeling (most probably wrong, but I prefer to speak up rather than
> keep this to myself) that the cost of a segment load is delayed until the first
> use of the segment selector. Sort of a lazy reload...
>
Probably not too much, since the load itself has to raise a fault if
there's any problem with the segment itself, and once it is loaded you
can change the underlying descriptor without affecting the segment
register. Even if it were lazy, that would only make the first %gs use a
bit slow, and shouldn't affect the subsequent ones. However, when I
measured segment register use timings, I didn't see any dramatic costs
associated with segment register use which would account for a 5% hit in
your benchmark.
J
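The remark that a loaded segment register is unaffected by later changes to the underlying descriptor can be shown with a toy model of the register's hidden (cached) part. This is a deliberately simplified sketch: the struct and function names are hypothetical, and real hardware also performs type, limit and privilege checks at load time that are left out here.

```c
#include <assert.h>

/* Simplified descriptor: only the fields needed for the model. */
struct descriptor {
	unsigned base;
	unsigned limit;
};

/* Segment register: visible selector plus a hidden copy of the descriptor. */
struct segreg {
	unsigned selector;
	struct descriptor cached;
};

/* Loading a selector copies the descriptor into the hidden part. */
void load_segreg(struct segreg *sr, const struct descriptor *table,
		 unsigned nentries, unsigned selector)
{
	unsigned index = selector >> 3;	/* drop the TI and RPL bits */

	assert(index < nentries);
	sr->selector = selector;
	sr->cached = table[index];
}

/* Address translation uses only the cached copy, never the table. */
unsigned translate(const struct segreg *sr, unsigned offset)
{
	return sr->cached.base + offset;
}
```

In this model, changing a table entry after `load_segreg()` has no effect on `translate()` until the selector is loaded again, which is why any cost tied to the descriptor contents has to be paid at load time.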