Closed
Bug 418866
Opened 17 years ago
Closed 15 years ago
fix profile-guided optimization (pgo) on linux
Categories
(Firefox Build System :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: ted, Assigned: taras.mozilla)
References
Details
(Whiteboard: [not needed for 1.9.0][ts])
Attachments
(5 files, 3 obsolete files)
2.25 KB,
patch
|
coop
:
review+
beltzner
:
approval1.9b4-
ted
:
approval1.9+
|
Details | Diff | Splinter Review |
544 bytes,
patch
|
benjamin
:
review+
ted
:
approval1.9+
|
Details | Diff | Splinter Review |
1.18 KB,
text/plain
|
Details | |
2.11 KB,
patch
|
ted
:
review+
|
Details | Diff | Splinter Review |
1.76 KB,
patch
|
Details | Diff | Splinter Review |
Patch in a bit.
Flags: blocking1.9?
Updated•17 years ago
|
Reporter | ||
Comment 1•17 years ago
|
||
This was actually ok before bug 361343. There was another older bug where I made the existing PGO support work again.
No longer depends on: 361343
Reporter | ||
Comment 2•17 years ago
|
||
There you go.
Assignee: nobody → ted.mielczarek
Status: NEW → ASSIGNED
Updated•17 years ago
|
Attachment #304838 -
Flags: review+
Reporter | ||
Comment 3•17 years ago
|
||
So, this is going to be an unpopular change, I think we need to make it clobber as well.
Attachment #304838 -
Attachment is obsolete: true
Attachment #304854 -
Flags: review?
Reporter | ||
Updated•17 years ago
|
Attachment #304854 -
Flags: review? → review?(ccooper)
Comment 4•17 years ago
|
||
From offline discussions with Damon, RobSayre, and then Ted:
1) This is being proposed for nightly/clobber builds, as well as hourly/incremental builds as well as release builds.
2) There are similar changes required for Mac. Ted is filing a bug for that.
3) This double-compile process will take extra time to cycle a build; seems like linux would be 2x, win32 would be just under 2x. This will need to be communicated to everyone watching Tinderbox.
4) We are not blocked on bug#418896 or bug#418894.
Updated•17 years ago
|
Attachment #304854 -
Flags: review?(ccooper) → review+
Comment 6•17 years ago
|
||
I had to back this out because of the following error:
/builds/tinderbox/Fx-Trunk/Linux_2.6.18-8.el5_Depend/mozilla/memory/jemalloc/jemalloc.c: In function 'choose_arena':
/builds/tinderbox/Fx-Trunk/Linux_2.6.18-8.el5_Depend/mozilla/memory/jemalloc/jemalloc.c:6043: error: corrupted profile info: edge from 5 to 6 exceeds maximal count
gmake[4]: *** [jemalloc.o] Error 1
Comment 7•17 years ago
|
||
Looks like this might be just jemalloc causing problems. We should be able to skip PGO for that.
Comment 9•17 years ago
|
||
Comment on attachment 305385 [details] [diff] [review]
skip PGO in jemalloc [checked in]
File a followup? I imagine PGO for jemalloc would useful if simply for branch-prediction/locality optimizations.
Attachment #305385 -
Flags: review?(benjamin) → review+
Reporter | ||
Comment 10•17 years ago
|
||
Comment on attachment 305385 [details] [diff] [review]
skip PGO in jemalloc [checked in]
Filed bug 419470, will mention it in a comment.
Attachment #305385 -
Flags: approval1.9?
Comment 11•17 years ago
|
||
Comment on attachment 305385 [details] [diff] [review]
skip PGO in jemalloc [checked in]
a1.9+=damons
Attachment #305385 -
Flags: approval1.9? → approval1.9+
Reporter | ||
Comment 12•17 years ago
|
||
Comment on attachment 305385 [details] [diff] [review]
skip PGO in jemalloc [checked in]
sayrer: you should be clear to try this again, but given that we're freezing tomorrow night, it might be rough to do it before wednesday.
Attachment #305385 -
Attachment description: skip PGO in jemalloc → skip PGO in jemalloc [checked in]
Reporter | ||
Updated•17 years ago
|
Attachment #304854 -
Flags: approval1.9b4?
Updated•17 years ago
|
Flags: tracking1.9? → blocking1.9?
Comment 13•17 years ago
|
||
Do you guys need a clean window to do this, like with Windows?
Comment 14•17 years ago
|
||
(In reply to comment #13)
> Do you guys need a clean window to do this, like with Windows?
That would be best, but I don't think it's critical we do it this week. Our best avenue for beta linux users seems to be distro betas. So I'm content to focus on mac and additional Windows profile input for the rest of this week.
Updated•17 years ago
|
Flags: blocking1.9? → blocking1.9+
Priority: P1 → P2
Comment 15•17 years ago
|
||
Comment on attachment 304854 [details] [diff] [review]
with clobbery
a=beltzner for after beta 4
Attachment #304854 -
Flags: approval1.9b4?
Attachment #304854 -
Flags: approval1.9b4-
Attachment #304854 -
Flags: approval1.9+
Comment 17•17 years ago
|
||
(In reply to comment #15)
> (From update of attachment 304854 [details] [diff] [review])
> a=beltzner for after beta 4
Therefore, removing as blocker on bug#418926 (3.0beta4 tracking bug)
No longer blocks: 418926
Reporter | ||
Comment 18•17 years ago
|
||
Rob: do you want to do this? This has been sort of hanging out here for a while.
Updated•17 years ago
|
Assignee: ted.mielczarek → nobody
Status: ASSIGNED → NEW
Component: Build & Release → Build Config
Flags: blocking1.9+
Product: mozilla.org → Firefox
QA Contact: build → build.config
Version: other → unspecified
Updated•17 years ago
|
Assignee: nobody → ted.mielczarek
Comment 19•17 years ago
|
||
Given limited time in the release we are punting pgo on mac and linux - let's pick this up in next release
Reporter | ||
Updated•17 years ago
|
Assignee: ted.mielczarek → nobody
Priority: P2 → --
Reporter | ||
Comment 20•17 years ago
|
||
Comment on attachment 304854 [details] [diff] [review]
with clobbery
Clearing approval flag to get this off the radar.
Attachment #304854 -
Flags: approval1.9+
Reporter | ||
Comment 21•17 years ago
|
||
Comment on attachment 305385 [details] [diff] [review]
skip PGO in jemalloc [checked in]
Clearing approval flag to get this off the radar.
Attachment #305385 -
Flags: approval1.9+
Reporter | ||
Updated•17 years ago
|
Attachment #304854 -
Flags: approval1.9+
Reporter | ||
Updated•17 years ago
|
Attachment #305385 -
Flags: approval1.9+
Updated•17 years ago
|
Whiteboard: [missed 1.9 checkin]
Reporter | ||
Comment 22•17 years ago
|
||
We may take this in 3.next.
Whiteboard: [missed 1.9 checkin] → [not needed for 1.9]
Updated•16 years ago
|
Flags: blocking-firefox3.1?
Reporter | ||
Comment 23•16 years ago
|
||
I wouldn't block on it, though.
Flags: blocking-firefox3.1? → blocking-firefox3.1-
Updated•16 years ago
|
Flags: wanted-firefox3.1?
![]() |
||
Comment 24•16 years ago
|
||
So I just tried compiling with pgo on Linux. I get:
../../../../db/morkreader/libmorkreader_s.a(nsMorkReader.o): In function `global constructors keyed to 65535_0_nsMorkReader.cpp':
/home/bzbarsky/mozilla/debug/mozilla/db/morkreader/nsMorkReader.cpp:579: undefined reference to `__gcov_init'
../../../../db/morkreader/libmorkreader_s.a(nsMorkReader.o):(.data.rel+0x24): undefined reference to `__gcov_merge_add'
../../../../db/morkreader/libmorkreader_s.a(nsMorkReader.o):(.data.rel+0x30): undefined reference to `__gcov_merge_single'
../../../../db/morkreader/libmorkreader_s.a(nsMorkReader.o):(.data.rel+0x3c): undefined reference to `__gcov_merge_single'
../../../../db/morkreader/libmorkreader_s.a(nsMorkReader.o):(.data.rel+0x48): undefined reference to `__gcov_merge_add'
../../../../db/morkreader/libmorkreader_s.a(nsMorkReader.o):(.data.rel+0x54): undefined reference to `__gcov_merge_ior'
/home/bzbarsky/mozilla/debug/obj-firefox-gpo/dist/lib/libunicharutil_s.a(nsUnicharUtils.o): In function `global constructors keyed to 65535_0_nsUnicharUtils.cpp':
/home/bzbarsky/mozilla/debug/obj-firefox-gpo/intl/unicharutil/util/internal/nsUnicharUtils.cpp:226: undefined reference to `__gcov_init'
/home/bzbarsky/mozilla/debug/obj-firefox-gpo/dist/lib/libunicharutil_s.a(nsUnicharUtils.o):(.data.rel+0x24): undefined reference to `__gcov_merge_add'
/home/bzbarsky/mozilla/debug/obj-firefox-gpo/dist/lib/libunicharutil_s.a(nsUnicharUtils.o):(.data.rel+0x30): undefined reference to `__gcov_merge_single'
collect2: ld returned 1 exit status
gmake[7]: *** [libplaces.so] Error 1
Comment 25•16 years ago
|
||
I think Murali was using gcov on Linux lately... maybe he can give some pointers?
Comment 26•16 years ago
|
||
Preliminary symptoms seems to be issues with LDFLAGS.
I need to check the .mozconfig file to comment further.
Updated•16 years ago
|
Summary: turn on profile-guided optimization on fx-linux-tbox → turn on profile-guided optimization on linux
Whiteboard: [not needed for 1.9] → [not needed for 1.9.0]
Comment 27•16 years ago
|
||
bz, please post your .mozconfig for Murali, as per comment #26.
![]() |
||
Comment 28•16 years ago
|
||
I had to do some manual copy/paste inclusion (I assumed you didn't want me to attach the 4-5 mozconfig files involved that include each other), and I removed all the comment lines. Other than that, this is the mozconfig involved.
Building on FC9 with:
% g++ --version
g++ (GCC) 4.3.0 20080428 (Red Hat 4.3.0-8)
% ld --version
GNU ld version 2.18.50.0.6-7.fc9 20080403
and ccache enabled.
Comment 29•16 years ago
|
||
Please add the following line to your .mozconfig file
export LDFLAGS="-lgcov -static-libgcc"
However, on a separate note, you are using '-g' flag for CFLAGS and CXXFLAGS and also using 'disable-debug'. Is it by design.
Thanks
Murali
Reporter | ||
Comment 30•16 years ago
|
||
Last time I tried this (with an older GCC, definitely not 4.3) it didn't require any special mozconfig hacking. It's possible gcc just sucks.
![]() |
||
Comment 31•16 years ago
|
||
> export LDFLAGS="-lgcov -static-libgcc"
I can try that, but sounds like https://developer.mozilla.org/en/Building_with_Profile-Guided_Optimization needs updating. Or the build target should perhaps handle this automatically.
AS for -g and --disable-debug, yes. I want to be able to have debug symbols in my optimized builds, both for crash stacks and actual debugging of issues that only show up in the optimized build.
Reed, why do you think gcov is the same as PGO? They're very different things, although I could understand if they shared some common infrastructure. Is there something I'm missing?
Reporter | ||
Comment 33•16 years ago
|
||
(In reply to comment #31)
> > export LDFLAGS="-lgcov -static-libgcc"
>
> I can try that, but sounds like
> https://developer.mozilla.org/en/Building_with_Profile-Guided_Optimization
> needs updating. Or the build target should perhaps handle this automatically.
The build system should handle this all automatically. It's not doing anything complicated, just -fprofile-generate / -fprofile-use:
http://mxr.mozilla.org/mozilla-central/source/configure.in#7030
Comment 34•16 years ago
|
||
(In reply to comment #32)
> Reed, why do you think gcov is the same as PGO? They're very different things,
> although I could understand if they shared some common infrastructure. Is
> there something I'm missing?
I only mentioned gcov because of what the errors in comment #24 said.
![]() |
||
Comment 35•16 years ago
|
||
Adding those LDFLAGS doesn't help; I still get the error from comment 24. The relevant command running at the time is:
c++ -fno-rtti -fno-exceptions -Wall -Wpointer-arith -Woverloaded-virtual -Wsynth -Wno-ctor-dtor-privacy -Wno-non-virtual-dtor -Wcast-align -Wno-invalid-offsetof -Wno-long-long -pedantic -g -fno-strict-aliasing -fshort-wchar -pthread -pipe -DNDEBUG -DTRIMMED -g -fno-inline -Os -freorder-blocks -fno-reorder-functions -fPIC -shared -Wl,-z,defs -Wl,-h,libplaces.so -o libplaces.so nsAnnoProtocolHandler.o nsAnnotationService.o nsFaviconService.o nsNavHistory.o nsNavHistoryExpire.o nsNavHistoryQuery.o nsNavHistoryResult.o nsNavBookmarks.o nsMaybeWeakPtr.o nsMorkHistoryImporter.o nsPlacesModule.o nsNavHistoryAutoComplete.o -lpthread -lgcov -static-libgcc -Wl,-rpath-link,/home/bzbarsky/mozilla/debug/obj-firefox-pgo/dist/bin -Wl,-rpath-link,/usr/local/lib ../../../../db/morkreader/libmorkreader_s.a /home/bzbarsky/mozilla/debug/obj-firefox-pgo/dist/lib/libunicharutil_s.a -L/home/bzbarsky/mozilla/debug/obj-firefox-pgo/dist/bin -lxpcom -lxpcom_core -L/home/bzbarsky/mozilla/debug/obj-firefox-pgo/dist/bin -L/home/bzbarsky/mozilla/debug/obj-firefox-pgo/dist/lib -lplds4 -lplc4 -lnspr4 -lpthread -ldl -Wl,--version-script -Wl,/home/bzbarsky/mozilla/debug/mozilla/build/unix/gnu-ld-scripts/components-version-script -Wl,-Bsymbolic -lasound -ldl -lm
Updated•16 years ago
|
Flags: wanted-firefox3.6?
Flags: blocking-firefox3.6?
Comment 36•16 years ago
|
||
I tried Mozilla's PGO build and at one point or another, GCC would bail out complaining about corrupted profiling files, much like the issue that led to patch #305385. It turns out that GCC's profiling support was written without taking threaded applications into account, leading to inconsistencies in profiling information, which would then cause GCC to error out when reading them later.
In GCC-4.4, a new compiler flag was added to solve this issue:
"When using -fprofile-generate with a multi-threaded program, the profile counts may be slightly wrong due to race conditions. The new -fprofile-correction option directs the compiler to apply heuristics to smooth out the inconsistencies. By default the compiler will give an error message when it finds an inconsistent profile."
This patch adds a the -fprofile-correction flag to the PGO compiler flags and re-enables PGO for jemalloc. Naturally, this adds a dependency on GCC-4.4 for PGO.
(I was unable to reproduce bz's linking issue.)
Comment 37•16 years ago
|
||
Comment on attachment 372234 [details] [diff] [review]
Use -fprofile-correction and enable PGO for jemalloc
Several rebuilds later I ran into the jemalloc build error. So clearly there is more going on here than meets the eye.
Attachment #372234 -
Attachment is obsolete: true
Updated•16 years ago
|
Summary: turn on profile-guided optimization on linux → turn on profile-guided optimization (pgo) on linux
Comment 38•16 years ago
|
||
TEST-UNEXPECTED-FAIL | (automation.py) | Exited with code -11 during test run
INFO | (automation.py) | Application ran for: 0:00:00.465828
make: *** [profiledbuild] Error 245
Comment 39•16 years ago
|
||
I've got the same error on x86-64 with gcc4.4.0 no matter what mozconfig options I use.
Comment 40•16 years ago
|
||
I also get the TEST-UNEXPECTED-FAIL error when building with pgo in the moz-central branch as well as the 191 branch if pgo is on. The firefox profile-generation binary is segfaulting in __gcov_init -> strlen. I've tried the profile-correction patch as well as disabling jemalloc pgo with no success.
I get this on both 4.3 and 4.4, i haven't tested <=4.2
Comment 41•16 years ago
|
||
I ran into an identical gcov issue while trying to build Wine on Ubuntu 9.04: http://bugs.winehq.org/show_bug.cgi?id=18832
I'm beginning to think there's a deeper problem exposing this for both of us, like something within gcc and gcov
This thread (about another gcov problem): http://nic.uoregon.edu/pipermail/tau-users/2009-April/000245.html suggests that it might be a mismatch issue:
> I was able to reproduce this only by having a mismatch of GCC built
> libraries. When I used a GCC 3.4.6 built TAU/PDT, I got:
>
> % tau_cc.sh -fprofile-arcs -fprofile-generate foo.c
>
> ...
> : undefined reference to `__gcov_indirect_call_profiler'
>
> However, when I rebuilt TAU,PDT, and then used it, I had no problem.
>
> When I'd used all 3.4.6 built libraries, it all worked as well, only with
> the mismatch could I reproduce it.
>
Reporter | ||
Comment 42•16 years ago
|
||
It certainly sounds like a busted GCC or gcov or something, not at all a problem with our codebase.
Comment 43•16 years ago
|
||
TEST-UNEXPECTED-FAIL | (automation.py) | Exited with code -11 during test run
INFO | (automation.py) | Application ran for: 0:00:00.465828
make: *** [profiledbuild] Error 245
Seems to connected with pgo script. When I'm using simple shell script, that just start and close ff, it builds just fine.
Reporter | ||
Comment 44•16 years ago
|
||
The PGO script simply sets up the environment and launches the browser. The "code -11" there is the browser crashing.
Comment 45•16 years ago
|
||
Strange, as the same config give woring non-pgo build, also as I mentioned before, shell script produces working pgo build. So what's wrong? :)
Comment 46•16 years ago
|
||
There may be multiple issues at play, but when i was getting the code -11, running the profile-generate binary manually or through any script segfaulted. Non-pgo builds work fine for me, it's only the profile-generation binary that has issues. Skipping that and doing the final build without profile data negates the purpose of PGO.
Its possible your script is running the wrong binary, not stopping the build should it fail, or you're simply running into a different issue.
I've tried gcc 4.3 and 4.4, 4.4's profile-correction, enabling and disabling optimization, bare minimum mozconfigs, central and 191src. I used to run nightly builds with pgo on, and they just broke one day. I haven't gotten one to compile since.
Comment 47•16 years ago
|
||
Here's an issue I found from googling for gcov symbol errors:
http://stackoverflow.com/questions/566472/where-is-the-gcov-symbols
---
I just spent an incredible amount of time debugging a very similar error.
Here's what I learned:
* You have to pass -fprofile-arcs -ftest-coverage when compiling.
* You have to pass -fprofile-arcs when linking.
* You can still get weird linker errors when linking. They'll look like
this: libname.a(objfile.o):(.ctors+0x0): undefined reference to `global
constructors keyed to long_name_of_file_and_function'
This means that gconv's having problem with one of your compiler-generated
constructors (in my case, a copy-constructor). Check the function mentioned in
the error message, see what kinds of objects it copy-constructs, and see if any
of those classes doesn't have a copy constructor. Add one, and the error will
go away.
Edit: Whether or not you optimize can also affect this. Try turing on /
switching off optimizations if you're having problems with it.
---
This may be what's causing the problem in both Wine and Firefox.
Comment 48•16 years ago
|
||
fantastic
![]() |
||
Comment 49•16 years ago
|
||
For my error, maybe the issue is the nsTArray use on MorkColumn (which has no copy ctor)? It'd suck if gcov required explicit copy ctors on everything we use nsTArray on. :(
Comment 50•16 years ago
|
||
Try fixing just one of the copy constructors and see if gcc errors in a different place - if so, then we'll indeed need one for every function :(
Comment 51•16 years ago
|
||
Why am I reading MorkAnything in 2009?
/be
![]() |
||
Comment 52•16 years ago
|
||
Because you want to be able to migrate Firefox 2 user profiles?
Comment 53•16 years ago
|
||
MorkReader (not Mork itself) is still used for import from pre-Firefox 3.0 history.
Comment 54•16 years ago
|
||
(Personally, I don't mind if FF-after-3.5, or even FF 3.5.1 can't import history from FF 2.0.)
Comment 55•16 years ago
|
||
We should get that on file... I'm not opposed to killing migration-from-distant-past in .next, though 3.5.1 seems premature.
Comment 56•16 years ago
|
||
make -f client.mk profiledbuild not working at present
/media/sdb1/programs/mozilla/central/src/netwerk/protocol/http/src/nsHttpConnectionMgr.cpp:989: error: corrupted profile info: number of executions for edge 2-3 thought to be 30069
/media/sdb1/programs/mozilla/central/src/netwerk/protocol/http/src/nsHttpConnectionMgr.cpp:989: error: corrupted profile info: number of executions for edge 2-5 thought to be -1
nsHttpAuthManager.cpp
make[9]: *** [nsHttpConnectionMgr.o] Error 1
make -f client.mk build working
Last time I built successfully was about last Thursday or Friday (am at GMT+10) here
Reporter | ||
Comment 57•16 years ago
|
||
This looks like a GCC problem. I think given the sheer number of comments here, GCC's PGO is so buggy/fickle as to be unusable.
Comment 58•16 years ago
|
||
And today
make -f client.mk profiledbuild
is working
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2a1pre) Gecko/20090723 Minefield/3.6a1pre ID:20090723105705
It did complain and stop in the configure section about libiw-dev being missing (ubuntu here and apt-get fixed that) but I don't know if that is related to the previous failure.
Comment 59•16 years ago
|
||
Profiledbuild died again today 1245pm Monday 3 Aug 2009 Aust Eastern Std Time (GMT +10). Don't know when the change to make it fail happened. Sometime between now and last Thursday.
/media/sdb1/programs/mozilla/central/src/xpcom/threads/nsEventQueue.cpp: In member function ‘PRBool nsEventQueue::PutEvent(nsIRunnable*)’:
/media/sdb1/programs/mozilla/central/src/xpcom/threads/nsEventQueue.cpp:141: error: corrupted profile info: number of executions for edge 17-18 thought to be 3064
/media/sdb1/programs/mozilla/central/src/xpcom/threads/nsEventQueue.cpp:141: error: corrupted profile info: number of executions for edge 17-23 thought to be -1
/media/sdb1/programs/mozilla/central/src/xpcom/threads/nsEventQueue.cpp:141: error: corrupted profile info: number of executions for edge 23-24 thought to be 3064
/media/sdb1/programs/mozilla/central/src/xpcom/threads/nsEventQueue.cpp:141: error: corrupted profile info: number of executions for edge 23-26 thought to be -1
make[7]: *** [nsEventQueue.o] Error 1
make[7]: *** Waiting for unfinished jobs....
/media/sdb1/programs/mozilla/central/src/xpcom/threads/nsThreadPool.cpp: In member function ‘virtual nsresult nsThreadPool::Run()’:
/media/sdb1/programs/mozilla/central/src/xpcom/threads/nsThreadPool.cpp:163: warning: ‘idleSince’ may be used uninitialised in this function
make[7]: Leaving directory `/media/sdb1/programs/mozilla/obj-i686-pc-linux-gnu/ffpgo/browser/xpcom/threads'
make[6]: *** [libs] Error 2
make[6]: Leaving directory `/media/sdb1/programs/mozilla/obj-i686-pc-linux-gnu/ffpgo/browser/xpcom'
make[5]: *** [libs_tier_xpcom] Error 2
make[5]: Leaving directory `/media/sdb1/programs/mozilla/obj-i686-pc-linux-gnu/ffpgo/browser'
make[4]: *** [tier_xpcom] Error 2
make[4]: Leaving directory `/media/sdb1/programs/mozilla/obj-i686-pc-linux-gnu/ffpgo/browser'
make[3]: *** [default] Error 2
make[3]: Leaving directory `/media/sdb1/programs/mozilla/obj-i686-pc-linux-gnu/ffpgo/browser'
make[2]: *** [build] Error 2
make[2]: Leaving directory `/media/sdb1/programs/mozilla/central/src'
make[1]: *** [build] Error 2
make[1]: Leaving directory `/media/sdb1/programs/mozilla/central/src'
make: *** [profiledbuild] Error 2
ordinary build works.
Is there any chance of having an official pgo build attempted once per day, so that there will exist a body of data about how and when it fails?
Updated•16 years ago
|
Flags: wanted-firefox3.6?
Flags: wanted-firefox3.6-
Flags: blocking-firefox3.6?
Flags: blocking-firefox3.6-
Assignee | ||
Comment 60•15 years ago
|
||
(In reply to comment #57)
> This looks like a GCC problem. I think given the sheer number of comments here,
> GCC's PGO is so buggy/fickle as to be unusable.
That may be, but we have enough gcc-brains in mozilla now that we have little excuse to not try to deploy it. Here is why.
I am building with gcc 4.4.3 on fedora, didn't run into any issues mentioned in this bug(may be luck).
Here are some cold startup numbers. First is the name where ordered=(application of bug 549749), static=(bug 525013), and pgo=(pgo with debloating -freorder-blocks-and-partition). Second column is cold startup time. Third column is RSS(ie how much ram is paged in via mmap, in this experiment it basically demonstrates how horribly inefficiently our code is laid out right now).
firefox.stock: 2515ms 49452
firefox.ordered 1919ms 45344
firefox.static 2321ms 49616
firefox.static.ordered 1577ms 37072
firefox.static.pgo 1619ms 38436
The improvements in startup stem completely from improved seeking behavior. There is a 3x reduction in pagefaults between the slowest and fastest build.
The reason why pgo is fast(as far as I can tell it does not lay out the binary chronologically) is because -freorder-blocks-and-partition breaks up functions into their useful part(ie the part that executes) and a bloated part(ie error checking that isn't needed 99% of the time.
Unfortunately I haven't yet figured out how to chronologically order symbols in a pgo build, that would result in the fastest-feasible-firefox-binary.
Conclusion: PGO wins are massive and very real. Because of -freorder-blocks-and-partition pgo can effectively debloat our functions resulting in a nimbler-smaller firefox. This combined with likely more efficient-runtime performance means we should aggressively pursue PGO.
Aside: All this made me wonder if it's possible to build on top of -freorder-blocks-and-partition to output cold functions as thumb for mobile. From my limited understanding of arm stuff, that should give us a very nice footprint win.
Assignee | ||
Updated•15 years ago
|
Whiteboard: [not needed for 1.9.0] → [not needed for 1.9.0][ts]
Comment 62•15 years ago
|
||
FWIW, all I got from trying to build with PGO is a crash during libxul initialization when the profile is being run, i.e. after the first build, not the second one. And the crash has a very nice mozilla unrelated backtrace:
#0 0x00007ffff40807c1 in strlen () from /lib/libc.so.6
#1 0x00007ffff6823a92 in __gcov_init () from /tmp/xulrunner/dist/bin/libxul.so
#2 0x00007ffff6824f56 in __do_global_ctors_aux () from /tmp/xulrunner/dist/bin/libxul.so
#3 0x00007ffff51888ab in _init () from /tmp/xulrunner/dist/bin/libxul.so
#4 0x00007fffffffe908 in ?? ()
#5 0x00007ffff7dee429 in ?? () from /lib64/ld-linux-x86-64.so.2
#6 0x00007ffff7dee5af in ?? () from /lib64/ld-linux-x86-64.so.2
#7 0x00007ffff7de1b2a in ?? () from /lib64/ld-linux-x86-64.so.2
#8 0x0000000000000001 in ?? ()
#9 0x00007fffffffeb8c in ?? ()
#10 0x0000000000000000 in ?? ()
I haven't tried on x86, though, only on x86-64. But that means PGO already doesn't work everywhere.
I do think, however, that what could be more interesting than PGO (and would work better) is profile guided code improvement. The idea would be to hint the compiler that chunks of code are likely or unlikely to be run, depending on data gathered from the profile, using __builtin_expect() for gcc, for example. I think that would make at the very least 50% (if not near 100%) of what PGO does, without having to build twice, and without resorting on gcc working properly for PGO.
Comment 63•15 years ago
|
||
Confirmed, with the same gcc options, pgo works on x86 but not on x86-64.
Assignee | ||
Comment 64•15 years ago
|
||
(In reply to comment #63)
> Confirmed, with the same gcc options, pgo works on x86 but not on x86-64.
Which version of gcc? I can't seem to get anything other than 4.4 to build stuff.
Comment 65•15 years ago
|
||
I get basically the same crash after the first build on x86-64, with gcc 4.4.3.
Comment 66•15 years ago
|
||
> Which version of gcc? I can't seem to get anything other than 4.4 to build
> stuff.
4.4.3 from debian for both x86 and x86-64.
See Also: → https://launchpad.net/bugs/213708
Assignee | ||
Comment 67•15 years ago
|
||
Figured out the problem. profiledbuild doesn't work on GCC 4.3 because the build system occasionally drops the -fprofile-generate flags(ie while building parts of nss) which then confuses things at link time(newer compilers seem to be more lenient there).
Unfortunately even if one does pass the pgo options everywhere, the second build fails because gcc 4.3 does not have -Wcoverage-mismatch. Figuring out what compiler we should switch to now (4.5 or 4.4).
Comment 68•15 years ago
|
||
Taras, I am having difficulty reproducing your results. I am using a Slackware Linux user's build script to build firefox with PGO enabled:
http://www.linuxquestions.org/questions/linux-software-2/building-firefox-3-6-a-profile-guided-optimization-pgo-automated-build-script-784164/
I made some improvements to her script and contributed them back to the thread at linuxquestions.org, however, I am unable to reproduce any improvements in Sun Spider benchmarks when comparing her script's PGO-build to the system Firefox, even if both are patched to enable TraceMonkey support on 64-bit Linux.
I think my difficulty is because -freorder-blocks-and-partition is not being passed to GCC. What are you doing to pass -freorder-blocks-and-partition to GCC during the initial build?
Assignee | ||
Comment 69•15 years ago
|
||
(In reply to comment #68)
> I think my difficulty is because -freorder-blocks-and-partition is not being
> passed to GCC. What are you doing to pass -freorder-blocks-and-partition to GCC
> during the initial build?
in configure.in change
MOZ_OPTIMIZE_FLAGS="-Os -freorder-blocks -fno-reorder-functions -fomit-frame-pointer $MOZ_OPTIMIZE_SIZE_TWEAK" to
MOZ_OPTIMIZE_FLAGS="-Os -freorder-blocks-and-partition -fomit-frame-pointer $MOZ_OPTIMIZE_SIZE_TWEAK"
Those pgo instructions are a bit crazy. See https://developer.mozilla.org/en/Building_with_Profile-Guided_Optimization , all you have to do is make a script that can launch firefox. Add the PROFILE_GEN_SCRIPT bit to your .mozconfig and do make -f client.mk profiledbuild
Comment 70•15 years ago
|
||
Taras, thankyou for the information. I tried building firefox using mozilla's pgo instructions, but whenever I try building it, I get the following error message mid-way through the build:
OBJDIR=/home/richard/firefox-build/vanilla/mozilla-1.9.2/../firefox-build python /home/richard/firefox-build/vanilla/mozilla-1.9.2/../firefox-build/_profile/pgo/profileserver.py
INFO | automation.py | Application pid: 28455
TEST-UNEXPECTED-FAIL | automation.py | Exited with code -11 during test run
INFO | automation.py | Application ran for: 0:00:00.109925
make: *** [profiledbuild] Error 245
It has been like this whenever I would try doing PGO-builds over the past few years. Googling the error message gives me plenty of people with the same issue, but no solutions.
Assignee | ||
Comment 71•15 years ago
|
||
(In reply to comment #70)
> Taras, thankyou for the information. I tried building firefox using mozilla's
> pgo instructions, but whenever I try building it, I get the following error
> message mid-way through the build:
>
> OBJDIR=/home/richard/firefox-build/vanilla/mozilla-1.9.2/../firefox-build
> python
> /home/richard/firefox-build/vanilla/mozilla-1.9.2/../firefox-build/_profile/pgo/profileserver.py
> INFO | automation.py | Application pid: 28455
> TEST-UNEXPECTED-FAIL | automation.py | Exited with code -11 during test run
> INFO | automation.py | Application ran for: 0:00:00.109925
> make: *** [profiledbuild] Error 245
I am guessing that this this due to the same segfault as others have reported(on 64bit). This may be due to our buildsystem occasionally omitting pgo flags. The other approach to doing pgo is to build with
export PGO="-fprofile-generate=/tmp/pgo -fprofile-arc"
export CXX="g++ $PGO"
export CC="cc $PGO"
in your .mozconfig
then run firefox, then build it a second time with PGO=-fprofile-use=/tmp/pgo
It would be cool if someone could figure out how to build firefox with pgo on 64bit.
Comment 72•15 years ago
|
||
I don't see why the flags would be sometimes ommitted when building on x86-64 and not on x86. My guess is that the problem lies in gcc. Even building with CXX="g++ --coverage" and CC="gcc --coverage" results in the same crashes.
Assignee | ||
Comment 73•15 years ago
|
||
(In reply to comment #72)
> I don't see why the flags would be sometimes ommitted when building on x86-64
> and not on x86. My guess is that the problem lies in gcc. Even building with
> CXX="g++ --coverage" and CC="gcc --coverage" results in the same crashes.
They are occasionally omitted on all platforms. Thanks for the info,
Assignee | ||
Comment 74•15 years ago
|
||
files the x86_64 gcc bug as http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43825
Assignee | ||
Comment 75•15 years ago
|
||
this plus fix in bug 560897 = working pgo on x86_64(and likely gcc 4.5)
Comment 76•15 years ago
|
||
-fprofile-correction in optim flags doesn't sound right
Assignee | ||
Comment 77•15 years ago
|
||
(In reply to comment #76)
> -fprofile-correction in optim flags doesn't sound right
it's not something that would land, just enough to show that pgo works. Sticking those flags in right places is the right thing(tm).
Assignee | ||
Comment 78•15 years ago
|
||
Attachment #440627 -
Attachment is obsolete: true
Attachment #442123 -
Flags: review?(ted.mielczarek)
Comment 79•15 years ago
|
||
(In reply to comment #78)
> Created an attachment (id=442123) [details]
> configury
The patch is absolutely amazing. Works even with my complicated .mozconfig. Thank you very much! We all have been waiting for it.
Reporter | ||
Updated•15 years ago
|
Attachment #442123 -
Flags: review?(ted.mielczarek) → review+
Assignee | ||
Updated•15 years ago
|
Keywords: checkin-needed
Comment 80•15 years ago
|
||
Comment on attachment 442123 [details] [diff] [review]
configury
Pushed as http://hg.mozilla.org/mozilla-central/rev/7a6a20da79ae
Should this bug be closed or is actually enabling pgo build on build servers part of this bug ?
Updated•15 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 81•15 years ago
|
||
(In reply to comment #80)
> (From update of attachment 442123 [details] [diff] [review])
> Pushed as http://hg.mozilla.org/mozilla-central/rev/7a6a20da79ae
>
> Should this bug be closed or is actually enabling pgo build on build servers
> part of this bug ?
Thanks. We'll turn it on via bug 559964.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Reporter | ||
Updated•15 years ago
|
Summary: turn on profile-guided optimization (pgo) on linux → fix profile-guided optimization (pgo) on linux
Comment 82•15 years ago
|
||
Hm, it fails to compile now. Regression range is http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ae1773250d39&tochange=5b3604a3cfbe
gcc -o now.o -c -fomit-frame-pointer -Wall -pthread -march=core2 -O2 -pipe -fPIC -UDEBUG -DMOZILLA_CLIENT=1 -DNDEBUG=1 -DHAVE_VISIBILITY_HIDDEN_ATTRIBUTE=1 -DHAVE_VISIBILITY_PRAGMA=1 -DXP_UNIX=1 -D_GNU_SOURCE=1 -DHAVE_FCNTL_FILE_LOCKING=1 -DLINUX=1 -Di386=1 -DHAVE_LCHOWN=1 -DHAVE_STRERROR=1 -D_REENTRANT=1 -DFORCE_PR_LOG -D_PR_PTHREADS -UHAVE_CVAR_BUILT_ON_SEM -fprofile-use -fprofile-correction -Wcoverage-mismatch -freoder-blocks-and-partition /home/virus_found/abs/firefox/src/mozilla-central-build/nsprpub/config/now.c
cc1: error: unrecognized command line option "-freoder-blocks-and-partition"
make[5]: *** [now.o] Error 1
Assignee | ||
Comment 83•15 years ago
|
||
(In reply to comment #82)
> cc1: error: unrecognized command line option "-freoder-blocks-and-partition"
> make[5]: *** [now.o] Error 1
Sounds like you configured with one compiler and are building with an older one.
Comment 84•15 years ago
|
||
(In reply to comment #83)
> (In reply to comment #82)
>
> > cc1: error: unrecognized command line option "-freoder-blocks-and-partition"
> > make[5]: *** [now.o] Error 1
>
> Sounds like you configured with one compiler and are building with an older
> one.
I'm sorry, I don't get it. What do you mean by "configured"? I configure with autoconf-2.13.
Comment 85•15 years ago
|
||
(In reply to comment #83)
> (In reply to comment #82)
>
> > cc1: error: unrecognized command line option "-freoder-blocks-and-partition"
> > make[5]: *** [now.o] Error 1
>
> Sounds like you configured with one compiler and are building with an older
> one.
It just looks like there is a typo... reoder vs. reorder.
Comment 86•15 years ago
|
||
(In reply to comment #85)
> It just looks like there is a typo... reoder vs. reorder.
Yes, this turns out to be the reason. This was introduced in http://hg.mozilla.org/mozilla-central/rev/b5b016bb7c91
Comment 87•15 years ago
|
||
Comment 88•15 years ago
|
||
Comment on attachment 449604 [details] [diff] [review]
fixes two typos, introduced in b5b016bb7c91 commit
I'm not permitted to change anything in this bug, but adding comments and patches, so this needs to be checked in.
Attachment #449604 -
Flags: review?
Updated•15 years ago
|
Attachment #449604 -
Flags: review? → review?(ted.mielczarek)
Updated•15 years ago
|
Attachment #449604 -
Flags: review?(ted.mielczarek)
Reporter | ||
Comment 89•15 years ago
|
||
Comment on attachment 449604 [details] [diff] [review]
fixes two typos, introduced in b5b016bb7c91 commit
This is fallout from bug 564851. My patch there is correct, but somehow I screwed it up while landing, apparently. I checked in a fix in NSPR CVS.
Updated•6 years ago
|
Component: Build Config → General
Product: Firefox → Firefox Build System
You need to log in
before you can comment on or make changes to this bug.
Description
•