mozilla::safebrowsing::LookupCacheV4::VerifyChecksum consumes >400ms of CPU during startup
Categories
(Toolkit :: Safe Browsing, enhancement, P1)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox67 | --- | fixed |
People
(Reporter: gps, Assigned: dimi)
References
(Blocks 1 open bug)
Details
(Keywords: perf, perf:pageload)
Attachments
(7 files)
|
47 bytes,
text/x-phabricator-request
|
Details | Review | |
|
47 bytes,
text/x-phabricator-request
|
Details | Review | |
|
47 bytes,
text/x-phabricator-request
|
Details | Review | |
|
47 bytes,
text/x-phabricator-request
|
Details | Review | |
|
47 bytes,
text/x-phabricator-request
|
Details | Review | |
|
47 bytes,
text/x-phabricator-request
|
Details | Review | |
|
47 bytes,
text/x-phabricator-request
|
Details | Review |
| Assignee | ||
Updated•8 years ago
|
Updated•8 years ago
|
| Assignee | ||
Updated•8 years ago
|
| Assignee | ||
Comment 1•8 years ago
|
||
Comment 2•8 years ago
|
||
Updated•8 years ago
|
Updated•8 years ago
|
| Reporter | ||
Comment 3•8 years ago
|
||
| Assignee | ||
Comment 4•8 years ago
|
||
Comment 5•8 years ago
|
||
Comment 6•8 years ago
|
||
Comment 7•8 years ago
|
||
| Assignee | ||
Comment 8•8 years ago
|
||
Updated•8 years ago
|
| Assignee | ||
Comment 9•8 years ago
|
||
Updated•8 years ago
|
Comment 10•7 years ago
|
||
Comment 11•6 years ago
|
||
Another current profile:
https://perfht.ml/2DK57OC
400+ms on a local mozilla-inbound opt --enable-release build
Comment 12•6 years ago
|
||
Note that from IRC conversations, I believe verifying the safebrowsing database blocks actually browsing, so this is important - and adds a bunch of CPU and likely time to startup. 400ms on a 3.8GHz xeon is a lot on something less powerful, like arm64.
Updated•6 years ago
|
Updated•6 years ago
|
Updated•6 years ago
|
| Assignee | ||
Updated•6 years ago
|
| Assignee | ||
Updated•6 years ago
|
Comment 13•6 years ago
|
||
On the 2018 reference device, I captured a profile where we're spending about 8 seconds opening the safebrowsing database:
Are we calculating the checksum in this case? Is the work going on in this bug likely to help the issue in this profile?
| Assignee | ||
Comment 14•6 years ago
|
||
(In reply to Mike Conley (:mconley) (:⚙️) from comment #13)
On the 2018 reference device, I captured a profile where we're spending about 8 seconds opening the safebrowsing database:
Are we calculating the checksum in this case? Is the work going on in this bug likely to help the issue in this profile?
Yes, this is the case. The fix I am going to work here should help :)
Comment 15•6 years ago
|
||
Note that in that profile, only ~25% is verifying the hash - so even making that 0-time will only cut ~2 seconds from the startup, or less. See http://bit.ly/2E3vvU3
26% in nsresult mozilla::safebrowsing::LookupCacheV4::VerifyChecksum, 22% is LoadMetadata, and 20% is VariableLengthPrefixSet::LoadFromFile, 16% in nsUrlClassifierPrefixSet::LoadFromFile, and 15% in mozilla::safebrowsing::HashStore::Open (from two different paths)
A fair bit of those times seem to be in opening files
| Assignee | ||
Comment 16•6 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #15)
Note that in that profile, only ~25% is verifying the hash - so even making that 0-time will only cut ~2 seconds from the startup, or less. See http://bit.ly/2E3vvU3
26% in nsresult mozilla::safebrowsing::LookupCacheV4::VerifyChecksum, 22% is LoadMetadata, and 20% is VariableLengthPrefixSet::LoadFromFile, 16% in nsUrlClassifierPrefixSet::LoadFromFile, and 15% in mozilla::safebrowsing::HashStore::Open (from two different paths)
A fair bit of those times seem to be in opening files
The LoadMetadata should be removed during startup within the same patch I am working on.
The VariableLengthPrefixSet::LoadFromFile and nsUrlClassifierPrefixSet::LoadFromFile is required.
As for HashStore::Open, I may spend some time to see if this can be skipped after the work here is done.
| Assignee | ||
Comment 17•6 years ago
|
||
SafeBrowsing V4 protocol use SHA-256 as the checksum to check integrity
of update data and also the integrity of prefix files.
SafeBrowsing V2 HashStore use MD5 as the checksum to check integrity of
.sbstore
Since we are going to use CRC32 as the integrity check of V4 prefix files,
I think rename V4 "checksum" to SHA256 can improve readability.
| Assignee | ||
Comment 18•6 years ago
|
||
SHA256 is an expensive operation, we should avoid using them if
possible. SafeBrowsing prefix files are loaded during startup and
verify integrity with SHA256 which may affect the performance
especially on the low-end device.
This patch simply removes the SHA256 integrity check. CRC32 version
integrity check will be introduced in the other patch.
This patch also changes the behavior of recording
"Telemetry::URLCLASSIFIER_VLPS_LOAD_CORRUPT" a little bit.
It used to records only once per session(during startup, the first
time we load prefix set), now it records per update.
| Assignee | ||
Comment 19•6 years ago
|
||
SafeBrowsing prefix files LOAD/SAVE operations are handled in xxxPrefixSet.cpp.
It would be more clear if xxxPrefixSet.cpp only processes prefix data,
while LookupCacheV2/LookupCacheV4 which use prefix set process file.
This patch doesn't change any behavior, testcases need to update because
the LookupCache & xxxPrefixSet APIs are changed.
| Assignee | ||
Comment 20•6 years ago
|
||
After this patch, we may have the following files in SafeBrowsing
directory:
- (v2) .sbstore : Store V2 chunkdata, for update, MD5 integrity check
while load - (v2) .pset : Store V2 prefixset, for lookup, load upon startup, no
integrity check - (v4) .metadata : Store V4 state, for update, no integrity check
- (v4) .vlpset : Store V4 prefixset, for lookup, load upon startup,
CRC32 integrity check - (v4) .pset : V4 prefix set before this patch, should be removed
The magic string is also added to ".vlpset" header so we can add
a telemetry to see if sanity check is good enough for prefix set
integrity check (The telemetry is not yet added). If yes, we can remove
the CRC32 in the future for even better performance.
| Assignee | ||
Comment 21•6 years ago
|
||
This patch is to cleanup old SafeBrowsing v4 prefix files.
| Assignee | ||
Comment 22•6 years ago
|
||
To avoid forcing a redownload of SafeBrowsing v4 list.
Comment 23•6 years ago
|
||
Comment 24•6 years ago
|
||
Backed out 6 changesets (bug 1353956) for Linux Build bustage
Log:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=232294564&repo=autoland&lineNumber=35048
Push with failures:
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=71dafccc22ae493b656d9130cec82ba37d4babf7
Backout:
https://hg.mozilla.org/integration/autoland/rev/4f6e4cbf401f4e32bfa4431b6ee2e003f8faa832
| Assignee | ||
Comment 25•6 years ago
|
||
Comment 26•6 years ago
|
||
Comment 27•6 years ago
|
||
| bugherder | ||
https://hg.mozilla.org/mozilla-central/rev/c2331373e107
https://hg.mozilla.org/mozilla-central/rev/c7a253aed450
https://hg.mozilla.org/mozilla-central/rev/e083106dc24f
https://hg.mozilla.org/mozilla-central/rev/3b5da75b9c7b
https://hg.mozilla.org/mozilla-central/rev/aaba7c25b72b
https://hg.mozilla.org/mozilla-central/rev/a87bd3f9b87d
https://hg.mozilla.org/mozilla-central/rev/9bac094cec23
| Assignee | ||
Updated•6 years ago
|
Comment 28•6 years ago
|
||
Thanks Dimi!
What remains to be done from comment 15:
"22% is LoadMetadata, and 20% is VariableLengthPrefixSet::LoadFromFile, 16% in nsUrlClassifierPrefixSet::LoadFromFile, and 15% in mozilla::safebrowsing::HashStore::Open (from two different paths)
A fair bit of those times seem to be in opening files"
mconley - now that this landed, can you re-profile? thanks!
| Assignee | ||
Comment 29•6 years ago
|
||
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #28)
What remains to be done from comment 15:
"22% is LoadMetadata, and 20% is VariableLengthPrefixSet::LoadFromFile, 16% in nsUrlClassifierPrefixSet::LoadFromFile, and 15% in mozilla::safebrowsing::HashStore::Open (from two different paths)
This patch should also remove the "LoadMetadata", so the remains are:
- VariableLengthPrefixSet::LoadFromFile
- nsUrlClassifierPrefixSet::LoadFromFile
- mozilla::safebrowsing::HashStore::Open
(1) and (2) are necessary otherwise users will be unprotected.
I filed bug 1531354 to check if we can skip reading the HashStore during startup. I will check this next week.
Comment 30•6 years ago
|
||
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #28)
mconley - now that this landed, can you re-profile? thanks!
Done: http://bit.ly/2TVnOtE
We're still taking a very long time getting that database off of the disk...
Comment 31•6 years ago
|
||
~5.1 seconds in RegenActiveTables()(don't trust "Running time", the sample rate is >1ms and inconsistent at times). Split between:
- mozilla::safebrowsing::Classifier::GetLookupCache
- mozilla::safebrowsing::HashStore::Open
Then 900ms in GetTables()->TableRequest()->mozilla::safebrowsing::HashStore::Open()
Then 700ms in mozilla::safebrowsing::Classifier::LoadMetadata()
a few seconds later there's ~200ms in mozilla::safebrowsing::Classifier::AsyncApplyUpdates
Comment 32•6 years ago
|
||
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #31)
~5.1 seconds in RegenActiveTables()(don't trust "Running time", the sample rate is >1ms and inconsistent at times).
This is not CPU time, 98% of these samples are the thread being blocked in nsLocalFile::OpenNSPRFileDesc.
Comment 33•6 years ago
|
||
~5.1 seconds in RegenActiveTables()(don't trust "Running time", the sample rate is >1ms and inconsistent at times).
This is not CPU time, 98% of these samples are the thread being blocked in nsLocalFile::OpenNSPRFileDesc.
Yup - though not one call to open file from the stacks - this calls GetLookupCache()->LookupCacheV2::Open()->HashStore::Open() and directly calls HashStore::Open(), and cycles back and forth between them repeatedly (I presume for N different sets of files of safebrowsing data).
Some way to avoid opening N files and instead open 1 or two files would be a Big Win here I suspect. It may complicate updates.
| Assignee | ||
Comment 34•6 years ago
|
||
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #31)
~5.1 seconds in RegenActiveTables()(don't trust "Running time", the sample rate is >1ms and inconsistent at times). Split between:
- mozilla::safebrowsing::Classifier::GetLookupCache
- mozilla::safebrowsing::HashStore::Open
Then 900ms in GetTables()->TableRequest()->mozilla::safebrowsing::HashStore::Open()
Then 700ms in mozilla::safebrowsing::Classifier::LoadMetadata()a few seconds later there's ~200ms in mozilla::safebrowsing::Classifier::AsyncApplyUpdates
I think some of those file read can be skipped, I'll work on it in Bug 1531354
Updated•3 years ago
|
Description
•