summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Fix comparision error between pointer and NULLJan Niklas Hasse2024-02-291-1/+1
|
* Merge pull request #1949 from Flowdalic/load-capacityJan Niklas Hasse2024-02-293-17/+51
|\ | | | | Consider the remaining job capacity in main loop
| * Consider the remaining load capacity in main loopFlorian Schmaus2023-11-233-17/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This changes CanRunMore() to return an int instead of a bool. The return value is the "remaining load capacity. That is the number of new jobs that can be spawned without saturating a potential enabled load limitation (if ninja's -l option is used). We assume that every started edge increases the load by one. Hence the available "load capacity" is the maximum allowed load minus the current load. Previously, ninja would oversaturate the system with jobs, even though a load and job limit was provided, when multiple ninja builds are running. This is because changes in load average are inert, newly started dobs to no immediatly change the load average, yet ninja assumed that new jobs are immediately reflected in the load average. Ninja would retrieve the current 1min load average, check if it is below the limit and, if so, start a new job, and then repeat. Since it takes a while for the new job to get reflected in the load average, ninja would often spawn jobs until the job limit ("-j") is reached. If this is done by multiple parallel ninja builds, then the system becomes oversaturated, causing excessing context switches, which eventually slow down each and every build process. We can easily prevent this by considering the remaining load capacity in ninja's main loop. The following benchmark demonstrates how the change of this comit helps to speed up multiple parallel builds on the same host. We compare the total build times of 8 parallel builds of LLVM on a 256-core system using "ninja -j 258". ninja-master: 1351 seconds ninja-load-capacity: 920 seconds That is, with this commit, the whole process becomes 1.46× faster. The used benchmark script created and prepared 8 build directories, records the start time, spawns 8 subshells invoking "ninja -j 258", awaits the termination of those subshells, and records the end time. Besides the total running time, it also outputs /proc/loadavg, provides an indication of where the performance is gained: ninja-master: 3.90 93.94 146.38 1/1936 209125 ninja-load-capacity: 92.46 210.50 199.90 1/1936 36917 So with this change, ninja uses the available hardware cores better in the presence of competing ninja processes, while it does not overload the system. Finally, let us look at the two "dstat -cdgyl 60" traces of 8 parallel LLVM builds on a 256-core machine using "ninja -l 258": ninja-master --total-cpu-usage-- -dsk/total- ---paging-- ---system-- ---load-avg--- usr sys idl wai stl| read writ| in out | int csw | 1m 5m 15m 1 0 99 0 0| 12k 4759k| 5B 55B|1135 455 |17.9 70.3 38.1 38 6 56 0 0|2458B 7988k| 205B 0 | 34k 23k| 466 170 73.2 26 3 71 0 0| 102k 94M| 0 0 | 22k 6265 | 239 156 74.3 50 5 45 0 0|3149B 97M| 0 0 | 37k 12k| 257 191 92.2 58 6 36 0 0| 90k 71M| 0 0 | 43k 12k| 320 224 110 50 4 46 0 0| 52k 78M| 0 0 | 38k 6690 | 247 223 117 50 5 45 0 0| 202k 90M| 0 0 | 37k 9876 | 239 238 130 60 5 34 0 0| 109k 93M| 0 0 | 44k 8950 | 247 248 140 69 5 26 0 0|5939B 93M| 0 0 | 50k 11k| 309 268 154 49 4 47 0 0| 172k 111M| 0 0 | 36k 7835 | 283 267 161 58 7 35 0 0| 29k 142M| 0 0 | 45k 7666 | 261 267 168 72 4 24 0 0| 46k 281M| 0 0 | 50k 13k| 384 296 183 49 6 46 0 0| 68B 198M| 0 0 | 37k 6847 | 281 281 185 82 6 12 0 0| 0 97M| 0 0 | 59k 15k| 462 323 205 31 5 63 0 0| 0 301M| 0 0 | 26k 5350 | 251 291 202 66 7 28 0 0| 68B 254M| 0 0 | 49k 9091 | 270 292 208 68 8 25 0 0| 0 230M| 0 0 | 51k 8186 | 287 292 213 52 5 42 1 0| 0 407M| 0 0 | 42k 5619 | 207 271 211 29 7 64 0 0| 0 418M| 0 0 | 27k 2801 | 131 241 205 1 1 98 0 0| 137B 267M| 0 0 |1944 813 |55.8 199 193 0 0 100 0 0|2253B 43M| 0 0 | 582 365 |26.8 165 181 0 0 99 0 0| 0 68M| 0 0 | 706 414 |11.5 136 170 4 0 96 0 0| 0 13M| 0 0 |2892 378 |10.0 113 160 ninja-load-capacity --total-cpu-usage-- -dsk/total- ---paging-- ---system-- ---load-avg--- usr sys idl wai stl| read writ| in out | int csw | 1m 5m 15m 1 0 98 0 0| 12k 5079k| 5B 55B|1201 470 |1.35 40.2 115 43 6 51 0 0|3345B 78M| 0 0 | 34k 20k| 247 127 142 71 6 23 0 0| 0 59M| 0 0 | 53k 8485 | 286 159 152 60 5 35 0 0| 68B 118M| 0 0 | 45k 7125 | 277 178 158 62 4 35 0 0| 0 115M| 0 0 | 45k 6036 | 248 188 163 61 5 34 0 0| 0 96M| 0 0 | 44k 9448 | 284 212 173 66 5 28 0 0| 9B 94M| 0 0 | 49k 5733 | 266 219 178 64 7 29 0 0| 0 159M| 0 0 | 49k 6350 | 241 223 182 66 6 28 0 0| 0 240M| 0 0 | 50k 9325 | 285 241 191 68 4 27 0 0| 0 204M| 0 0 | 49k 5550 | 262 241 194 68 8 24 0 0| 0 161M| 0 0 | 53k 6368 | 255 244 198 79 7 14 0 0| 0 325M| 0 0 | 59k 5910 | 264 249 202 72 6 22 0 0| 0 367M| 0 0 | 54k 6684 | 253 249 205 71 6 22 1 0| 0 377M| 0 0 | 52k 8175 | 284 257 211 48 8 44 0 0| 0 417M| 0 0 | 40k 5878 | 223 247 210 23 4 73 0 0| 0 238M| 0 0 | 22k 1644 | 114 214 201 0 0 100 0 0| 0 264M| 0 0 |1016 813 |43.3 175 189 0 0 100 0 0| 0 95M| 0 0 | 670 480 |17.1 144 177 As one can see in the above dstat traces, ninja-master will have a high 1min load average, of up to 462. This is because ninja will not considered the remaining load capacity when spawning new jobs, but instead spawn as new jobs until it runs into the -j limitation. This, in turn, causes an increase of context switches: the rows with a high 1min load average also have >10k context switches (csw). Whereas a remaining load-capacity aware ninja avoids oversaturing the system with excessive additional jobs. Note that since the load average is an exponentially damped moving sum, build systems that take the load average into consideration to limit the load average to the number of available processors will always (slightly) overprovision the system with tasks. Eventually, this change decreases the aggressiveness ninja schedules new jobs if the '-l' knob is used, and by that, the level of overprovisioning, to a reasonable level compared to the status quo. It should be mentioned that this means that an individual build using '-l' will now be potentially a bit slower. However, this can easily be fixed by increase the value provided to the '-l' argument. The benchmarks where performed using the following script: set -euo pipefail VANILLA_NINJA=~/code/ninja-master/build/ninja LOAD_CAPACITY_AWARE_NINJA=~/code/ninja-load-capacity/build/ninja CMAKE_NINJA_PROJECT_SOURCE=~/code/llvm-project/llvm declare -ir PARALLEL_BUILDS=8 readonly TMP_DIR=$(mktemp --directory --tmpdir=/var/tmp) cleanup() { rm -rf "${TMP_DIR}" } trap cleanup EXIT BUILD_DIRS=() echo "Preparing build directories" for i in $(seq 1 ${PARALLEL_BUILDS}); do BUILD_DIR="${TMP_DIR}/${i}" mkdir "${BUILD_DIR}" ( cd "${BUILD_DIR}" cmake -G Ninja "${CMAKE_NINJA_PROJECT_SOURCE}" \ &> "${BUILD_DIR}/build.log" )& BUILD_DIRS+=("${BUILD_DIR}") done wait NPROC=$(nproc) MAX_LOAD=$(echo "${NPROC} + 2" | bc ) SLEEP_SECONDS=300 NINJA_BINS=( "${VANILLA_NINJA}" "${LOAD_CAPACITY_AWARE_NINJA}" ) LAST_NINJA_BIN="${LOAD_CAPACITY_AWARE_NINJA}" for NINJA_BIN in "${NINJA_BINS[@]}"; do echo "Cleaning build dirs" for BUILD_DIR in "${BUILD_DIRS[@]}"; do ( "${NINJA_BIN}" -C "${BUILD_DIR}" clean &> "${BUILD_DIR}/build.log" )& done wait echo "Starting ${PARALLEL_BUILDS} parallel builds with ${NINJA_BIN} using -j ${MAX_LOAD}" START=$(date +%s) for BUILD_DIR in "${BUILD_DIRS[@]}"; do ( "${NINJA_BIN}" -C "${BUILD_DIR}" -l "${MAX_LOAD}" &> "${BUILD_DIR}/build.log" )& done wait STOP=$(date +%s) DELTA_SECONDS=$((STOP - START)) echo "Using ${NINJA_BIN} to perform ${PARALLEL_BUILDS} of ${CMAKE_NINJA_PROJECT_SOURCE}" echo "took ${DELTA_SECONDS} seconds on this ${NPROC} core system using -j ${MAX_LOAD}" echo "/proc/loadavg:" cat /proc/loadavg echo "ninja --version:" "${NINJA_BIN}" --version if [[ "${NINJA_BIN}" != "${LAST_NINJA_BIN}" ]]; then echo "Sleeping ${SLEEP_SECONDS} seconds to bring system into quiescent state" sleep ${SLEEP_SECONDS} fi done
* | Merge pull request #2177 from peterbell10/cpsched-2Jan Niklas Hasse2024-02-297-52/+326
|\ \ | | | | | | Add simplified critical path scheduler to improve build times
| * | Simplify scheduler to not use build log/execution timePeter Bell2022-08-105-235/+51
| | |
| * | Merge remote-tracking branch 'upstream/master' into cpsched-2Peter Bell2022-08-1022-176/+783
| |\ \
| * | | Clarify the purpose of active_edges in back-propagationPeter Bell2022-03-081-3/+5
| | | |
| * | | Rename critical_time to critical_time_msPeter Bell2022-03-084-24/+24
| | | |
| * | | Pool: sort equally-weighted edges by priorityPeter Bell2022-03-082-4/+14
| | | |
| * | | Add test and fix priority bugPeter Bell2022-03-083-37/+231
| | | | | | | | | | | | | | | | | | | | AddTarget cannot add edges to the ready queue before the critical time has been computed.
| * | | Add run_time_ms accessors and more commentsPeter Bell2022-03-082-14/+22
| | | |
| * | | Improve comments and retrieve edges into ready_queue directlyPeter Bell2022-03-076-31/+28
| | | |
| * | | Add simple test for EdgeQueuePeter Bell2022-03-072-1/+54
| | | |
| * | | Remove unnecessary whitespacePeter Bell2022-03-072-2/+0
| | | |
| * | | Merge remote-tracking branch 'upstream/master' into cpschedPeter Bell2022-03-0732-209/+1011
| |\ \ \
| * | | | Address review commentsPeter Bell2022-03-073-48/+39
| | | | | | | | | | | | | | | | | | | | | | | | | 1. Move EdgePriorityQueue to graph.h and inherit from priority_queue 2. Add comment about edge->critical_time()
| * | | | Remove redundant includePeter Bell2021-10-081-1/+0
| | | | |
| * | | | Improve heuristic for unknown cost edgesPeter Bell2021-08-272-38/+77
| | | | |
| * | | | Address review commentsPeter Bell2021-08-271-8/+5
| | | | |
| * | | | Fix total_time computationPeter Bell2021-08-271-1/+3
| | | | |
| * | | | Address review commentsPeter Bell2021-08-273-29/+25
| | | | |
| * | | | clang-format diffPeter Bell2021-08-253-17/+12
| | | | |
| * | | | Fix critical time calculationPeter Bell2021-08-252-70/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing algorithm doesn't work because it strictly requires that all outputs are visited before updating an edge. So any task downstream from a task with multiple out-edges may get ignored. The fix is to always propagate your critical time to the next input node, and only place it in the queue if you offer a higher critical time.
| * | | | Change priority_list_ into a std::priority_queue of ready edgesPeter Bell2021-08-252-47/+92
| | | | |
| * | | | Use explicit std:: style and remove debug print statementsPeter Bell2021-08-252-56/+30
| | | | |
| * | | | support explicit build orderNico Weber2021-08-253-6/+200
| | | | |
* | | | | Merge pull request #2385 from icebreaker/missing_include_guardsJan Niklas Hasse2024-02-212-0/+10
|\ \ \ \ \ | | | | | | | | | | | | added missing include guards in order to support easy amalgamation
| * | | | | added missing include guardsMihail Szabolcs2024-02-182-0/+10
|/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These changes make it easy to build an amalgamated `ninja.cc` that can be used to bootstrap ninja with just a working C++ compiler, without the need for any third-party tools like `cmake` or `python`. *nix c++ -O2 src/ninja_amalgamated.cc -o ninja osx-cross x86_64-apple-darwin19-c++ -O2 src/one.cc -o ninja mingw x86_64-w64-mingw32-c++ -O2 src/ninja_amalgamated.cc -o ninja.exe msvc cl.exe /nologo /Ox /GR- src\ninja_amalgamated.cc /out:ninja.exe
* | | | | GitHub Actions: Only specify major version for actions/upload-release-assetJan Niklas Hasse2024-02-133-4/+4
| | | | |
* | | | | GitHub Actions: Don't specify patch version for actions/upload-release-assetJan Niklas Hasse2024-02-133-4/+4
| | | | |
* | | | | Merge pull request #2373 from digit-google/document-manual-generationJan Niklas Hasse2024-02-131-0/+36
|\ \ \ \ \ | | | | | | | | | | | | README.md: document Manual and Doxygen generation
| * | | | | README.md: document Manual and Doxygen generationDavid 'Digit' Turner2024-02-131-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | Fixes #2362
* | | | | | Merge pull request #2369 from Colecf/dark_modeJan Niklas Hasse2024-02-131-0/+26
|\ \ \ \ \ \ | | | | | | | | | | | | | | Add a dark mode to the docs
| * | | | | | Add a dark mode to the docsCole Faust2023-12-311-0/+26
| | | | | | |
* | | | | | | Merge pull request #2375 from scivision/correct-flagJan Niklas Hasse2024-02-131-3/+3
|\ \ \ \ \ \ \ | |_|/ / / / / |/| | | | | | correction to #2360
| * | | | | | correction to #2360scivision2024-01-161-3/+3
|/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | in general, flags check have to be to the non-no option. Updates use of deprecated CMake command.
* | | | | | Merge pull request #2360 from Simonhancrew/masterJan Niklas Hasse2024-01-021-0/+10
|\ \ \ \ \ \ | |/ / / / / |/| | | | | [FIX] compile: gcc version > 11.3, treat -Wmaybe-uninitialized as error
| * | | | | [FIX] compile: gcc version > 11.3, treat -Wmaybe-uninitialized as erroryourfather2023-12-301-0/+10
| | | | | |
* | | | | | Merge pull request #2358 from digit-google/fix-canonicalize-path2Jan Niklas Hasse2023-12-282-42/+208
|\ \ \ \ \ \ | |/ / / / / |/| | | | | CanonicalizePath: Remove kMaxComponents limit
| * | | | | CanonicalizePath: Remove kMaxComponents limitDavid 'Digit' Turner2023-12-072-42/+208
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch refactors the CanonicalizePath() to fix two issues and improve performance. This is achieved through the following: - Remove the kMaxPathComponents limit entirely, which fixes ninja-build#1732, by dropping the `components` array entirely, in favor of back-tracking the destination pointer. - Properly handle '/' and '\' which were incorrectly converted into an empty string. This fixes ninja-build#2008. - Skip initial '../' components in relative paths, as these are common when referencing source files in build plans, and doing so make the loop after this step run faster in practice since most source files do not need adjustments. - Simplify the inner loop logic by handling the last component (which is not followed by a trailing directory separator) separately. This noticeably improves performance because the inner loop becomes smaller with less branch mis-predictions in the general case. - Never access or copy the caharacter after the end of the input string. - Use memchr() to find the next '/' on Posix, which allows the use of SIMD implementations provided by the C runtime (e.g. through IFUNC functions on Linux), resulting in very noticeable speedup. This is also why a statically Ninja executable will be slower than one that links to the C library dynamically :-/ - Avoid performing any writes when the input path doesn't need any adjustment, which is also quite common. Note that this patch does _not_ remove the 64-bit limit for the `slash_bits` value, which is only used on Win32. Benchmarking was done in several ways: - On Linux, running `hyperfine canon-perftest` to run the canonicalization benchmark program and measure its total running time. Three compilers were used to generate dynamically-linked executables. ``` BEFORE (ms) AFTER (ms) GCC 13.2.0 651 369 Clang 14.0 591 402 Clang 18,0 653 400 ``` - On Windows, running `canon-perftest` 5 times and keeping the best reported average result. The number are slower since they only measure the benched function. ``` BEFORE (ms) AFTER (ms) Mingw64 GCC 12 246 195 ``` - On Linux, run `hyperfine ninja -C out/default -n --quiet` on a large Fuchsia build plan, once with 70000+ pending commands, and once after the build (i.e. `ninja: no work to do`). ```` BEFORE (s) AFTER (s) pre_build 8.789 8.647 post_build 6.703 6.590 ```
* | | | | | Merge pull request #2356 from jhasse/remove-w-dupbuildJan Niklas Hasse2023-12-078-87/+36
|\ \ \ \ \ \ | |/ / / / / |/| | | | | Remove `-w dupbuild` completely, always error on duplicate edges
| * | | | | Improve misleading error message when an output is defined multiple timesJan Niklas Hasse2023-12-066-11/+18
| | | | | |
| * | | | | Remove `-w dupbuild` completely, always error on duplicate edgesJan Niklas Hasse2023-11-294-79/+21
| | |_|_|/ | |/| | | | | | | | | | | | | Step 5, fixes #931.
* | | | | Merge pull request #2302 from Snektron/timer-to-micros-chronoJan Niklas Hasse2023-11-291-2/+7
|\ \ \ \ \ | |/ / / / |/| | | | metrics: use chrono to convert ticks to micros
| * | | | metrics: use chrono to convert ticks to microsRobin Voetter2023-06-151-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | This was previously causing undefined behavior, as multiplying dt by 1000000 overflowed on some systems. Fixes #2301.
* | | | | Merge pull request #2292 from digit-google/fix-stats-reportJan Niklas Hasse2023-11-223-2/+11
|\ \ \ \ \ | | | | | | | | | | | | Fix stats report
| * | | | | Don't double-count the 'node stat' metric.David 'Digit' Turner2023-09-251-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is already done in RealDiskInterface::Stat() itself, and removes a confusing duplicate line in `-d stats` output, e.g.: BEFORE: metric count avg (us) total (ms) ... node stat 119145 3.0 355.2 node stat 270673 3.1 834.9 AFTER: metric count avg (us) total (ms) ... node stat 270673 2.9 774.0
| * | | | | Fix .ninja parse time reported by `-d stats`.David 'Digit' Turner2023-09-252-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because Parser::Load() is called recursively during Ninja manifest parsing, the call to METRIC_RECORD() in this function used to over-count the total parsing time (for example, by a factor of 2 for the Fuchsia build). This fixes the problem by introducing a new RECORD_METRIC_IF() macro, which only records anything if a given condition is true. This ensures that metric collection only starts and stops with the outer Parser::Load() call, and none of its recursive sub-calls. The effect on the output of `-d stats`is, for a Fuchsia build plan where `ninja -d stats nothing` takes a bit more than 5s: BEFORE: metric count avg (us) total (ms) .ninja parse 27304 372.6 10172.2 AFTER: metric count avg (us) total (ms) .ninja parse 1 4165297.0 4165.3 Note that |count| went to 1, since there is only one top-level Parser::Load() operation in this build. It would be more if dyndeps files were loaded, which does not happen in this build plan.
* | | | | | Merge pull request #2219 from Siogian/masterJan Niklas Hasse2023-11-221-1/+1
|\ \ \ \ \ \ | | | | | | | | | | | | | | fix garbled error message, like Chinese in Windows
| * | | | | | fix garbled error message, like Chinese in Windowsxiaojian.liang2022-12-041-1/+1
| | | | | | |