
This sounds like it's just the cost to traverse the page table, right? ~300 cycles per raw memory lookup, and 3 of them because you'll typically need to go three levels deep?

The TLB is tiny these days, and 4 KB pages are tiny.

I'm super hopeful that Linus is going to force through some big improvements to HugePages, because Linux HugePages support is super painful at the moment. 2MB pages alone could be a massive gain.
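
To put rough numbers on the estimate above, here is a back-of-the-envelope sketch. The 300 cycles and three levels are the figures quoted in the comment. The TLB entry count of 64 is a hypothetical value chosen only for illustration, not a measurement of any particular CPU.

```python
# Back-of-the-envelope numbers for the page-walk estimate.
CYCLES_PER_LOOKUP = 300   # figure quoted in the comment
WALK_LEVELS = 3           # levels assumed in the comment
TLB_ENTRIES = 64          # hypothetical entry count, for illustration only

KIB = 1024
MIB = 1024 * KIB


def walk_cost(levels=WALK_LEVELS, cycles=CYCLES_PER_LOOKUP):
    """Worst-case cycles for one page walk if every level goes to raw memory."""
    return levels * cycles


def tlb_reach(page_size, entries=TLB_ENTRIES):
    """Bytes of address space covered when every TLB entry is in use."""
    return entries * page_size


print(walk_cost())                  # 900 cycles per miss
print(tlb_reach(4 * KIB) // KIB)    # 256 (KiB covered with 4 KB pages)
print(tlb_reach(2 * MIB) // MIB)    # 128 (MiB covered with 2 MB pages)
```

Under these assumptions the same number of TLB entries covers 512 times more memory with 2MB pages, which is the gain being hoped for.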



This

4k for a page is ridiculous. I'd say it was already ridiculous something like 5 years ago.

4M may be too big (I'm thinking 512k could be a sweet spot)

(or it could just work in chunks - I believe it does something like that already, and get multiple pages at once)


512K page size? Wouldn't that add a ton of IO in a lot of scenarios? Like every 1-byte file would now require a 512K event? Large pages (2MB/1GB) are for specialised use where you know you're not going to be paging things in/out too often, right?

IIRC Linus was quite dismissive about having larger-than-4k page sizes as the default.


As opposed to taking 256 pages per MB? And all the overhead of that.

It's probably easier to make something special for small files than making the current system go faster
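
The two sides of this exchange can be put into numbers with a small sketch. It assumes, as the parent comment does, that even a tiny file occupies at least one whole page; the helper names are made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def pages_needed(nbytes, page_size):
    """Number of whole pages required to hold nbytes (rounded up)."""
    return -(-nbytes // page_size)


def wasted_bytes(nbytes, page_size):
    """Bytes allocated but unused once nbytes is rounded up to whole pages."""
    return pages_needed(nbytes, page_size) * page_size - nbytes


# Columns: page size in KiB, pages per 1 MiB, waste for a 1-byte file.
for size in (4 * KIB, 512 * KIB):
    print(size // KIB, pages_needed(MIB, size), wasted_bytes(1, size))
```

With 4k pages, 1 MB takes 256 pages and a 1-byte file wastes 4095 bytes. With 512k pages, 1 MB takes 2 pages but the same file wastes 524287 bytes. That is the trade-off being argued over.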


You don't need to couple memory pages with disk blocks.


Switch to PowerPC and get 64k pages by default in most distros!


... and expose lots of buggy userspace code into the bargain!


What kind of bugs appear with bigger default page sizes?


Lots of userspace makes assumptions about page size being 4k and breaks when it changes. Try looking for:

https://www.google.co.uk/search?q="pagesize"+"64k"+"bug"

https://www.google.co.uk/search?q="pagesize"+"64k"+"issue"

Another common one is actually in the kernel where filesystem block sizes are limited to page sizes, so from this point of view large page sizes are better:

https://lwn.net/Articles/591690/
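
As an illustration of the userspace assumption described above, here is a minimal sketch of the usual fix: ask the system for its page size instead of hardcoding 4096. `mmap.PAGESIZE` is part of the Python standard library; the two alignment helpers are hypothetical names added for the example.

```python
import mmap

# Query the page size at runtime; it is 4096 on most x86 systems,
# but can be 65536 on a 64k-page kernel.
PAGE_SIZE = mmap.PAGESIZE


def align_down(offset, page_size=PAGE_SIZE):
    """Round offset down to the start of its page."""
    return offset - (offset % page_size)


def align_up(length, page_size=PAGE_SIZE):
    """Round length up to a whole number of pages."""
    return -(-length // page_size) * page_size


# Code that wrote `offset & ~0xFFF` would silently produce misaligned
# offsets on a 64k-page system; these helpers follow the real page size.
print(PAGE_SIZE, align_down(70000), align_up(1))
```

The same idea applies in C via `sysconf(_SC_PAGESIZE)`.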


And an even slower page fault mechanism. PPC generally ends up relying on hugepages to try to match page fault speeds on x86.


The thing is, all the relevant page table data should already be in L1 caches when the page fault handler returns. The TLB miss on iret should not require raw memory lookups and should be much faster than 300 cycles.

It's more likely that iret has simply always been slow (which is why the various syscall/sysenter extensions were created).



