update from origin 2020-06-18 #4

richkadel · 2020-06-18T22:43:23Z

No description provided.

We have been seeing some very inefficient code that went away when using `-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at `-Oz` was compiled into a function which consisted entirely of saving registers to the stack, then using the frame pointer to restore the same registers (without any instructions between the prolog and epilog). The RISC-V LLVM backend supports frame pointer elimination, so it makes sense to allow this to happen when using Rust. It's not clear to me that frame pointers have ever been required in the general case. In #61675 it was pointed out that this made reassembling stack traces easier, which is true, but there is a code generation option for forcing frame pointers, and I feel the default should not be to require frame pointers, given it demonstrably makes code size worse (around 10% in some embedded applications). The kinds of targets mentioned in #61675 are popular, but should not dictate that code generation should be worse for all RISC-V targets, especially as there is a way to use CFI information to reconstruct the stack when the frame pointer is eliminated. It is also a misconception that `fp` is always used for the frame pointer. `fp` is an ABI name for `x8` (aka `s0`), and if no frame pointer is required, `x8` may be used for other callee-saved values. This commit does ensure that the standard library is built with unwind tables, so that users do not need to rebuild the standard library in order to get a backtrace that includes standard library calls (which is the original reason for forcing frame pointers).

* Check for overflow when calculating the slice start & end position. * Align the pointer obtained from the allocator, ensuring that it satisfies user requested alignment (the allocator is only asked for layout compatible with u8 slice). * Remove an incorrect assertion from DroplessArena::align.

Return a pointer from `alloc_raw` instead of a slice. There is no practical use for slice as a return type and changing it to a pointer avoids forming references to an uninitialized memory.

When we're running with dry_run enabled (i.e. all builds do this initially), we're guaranteed to save of a toolstate of TestFail for tools that aren't tested. In practice, we do test tools as well, so for those tools we would initially record them as being TestPass, and then later on re-record the correct state after actually testing them. However, this would not work well if the build failed for whatever reason (e.g. panicking in bootstrap, or as was the case in 73097, clippy failing to test successfully), we would just go on believing that things passed when they in practice did not. This commit also adjusts saving toolstate to never record clippy explicitly (otherwise, it would be recorded when building it); eventually that'll likely move to other tools as well but not yet. This is deemed simpler than checking everywhere we generically save toolstate. We also move clippy out of the "toolstate" no-fail-fast build into a separate x.py invocation; this should no longer be technically required but provides the nice state of letting us check toolstate for all tools and only then check clippy (giving full results on every build).

store `ObligationCause` on the heap Stores `ObligationCause` on the heap using an `Rc`. This PR trades off some transient memory allocations to reduce the size of–and thus the number of instructions required to memcpy–a few widely used data structures in trait solving.

@oli-obk

Avoid prematurely recording toolstates When we're running with dry_run enabled (i.e. all builds do this initially), we're guaranteed to save of a toolstate of TestFail for tools that aren't tested. In practice, we do test tools as well, so for those tools we would initially record them as being TestPass, and then later on re-record the correct state after actually testing them. However, this would not work well if the build failed for whatever reason (e.g. panicking in bootstrap, or as was the case in #73097, clippy failing to test successfully), we would just go on believing that things passed when they in practice did not. This commit also adjusts saving toolstate to never record clippy explicitly (otherwise, it would be recorded when building it); eventually that'll likely move to other tools as well but not yet. This is deemed simpler than checking everywhere we generically save toolstate. We also move clippy out of the "toolstate" no-fail-fast build into a separate x.py invocation; this should no longer be technically required but provides the nice state of letting us check toolstate for all tools and only then check clippy (giving full results on every build). r? @oli-obk Supercedes #73275, also fixes #73274

Apparently it got changed.

Check for overflow in DroplessArena and align returned memory * Check for overflow when calculating the slice start & end position. * Align the pointer obtained from the allocator, ensuring that it satisfies user requested alignment (the allocator is only asked for layout compatible with u8 slice). * Remove an incorrect assertion from DroplessArena::align. * Avoid forming references to an uninitialized memory in DroplessArena. Helps with #73007, #72624.

Don't run generator transform when there's a TyErr Not sure if this might cause any problems later on, but we shouldn't be hitting codegen or const eval for the produced MIR anyways, so it should be fine. cc #72685 (comment)

@kinnison

…kinnison Re-order correctly the sections in the sidebar Before that, "trait implementations" and "implementors" titles in the sidebar were before "methods" for example. Which wasn't logical considering that the two sections come after in the "content". r? @kinnison

Use track caller for bug! macro

… r=Mark-Simulacrum Add more info to `x.py build --help` on default value for `-j JOBS`.

Fix typo in docs of std::mem

Use `Ipv4Addr::from<[u8; 4]>` when possible Resolve this comment: #73331 (comment)

Fix forge-platform-support URL Apparently it got changed.

@ghost

Rollup of 8 pull requests Successful merges: - #73237 (Check for overflow in DroplessArena and align returned memory) - #73339 (Don't run generator transform when there's a TyErr) - #73372 (Re-order correctly the sections in the sidebar) - #73373 (Use track caller for bug! macro) - #73380 (Add more info to `x.py build --help` on default value for `-j JOBS`.) - #73381 (Fix typo in docs of std::mem) - #73389 (Use `Ipv4Addr::from<[u8; 4]>` when possible) - #73400 (Fix forge-platform-support URL) Failed merges: r? @ghost

Update LLVM submodule Only includes one commit: - [D80759](https://reviews.llvm.org/D80759): Fix FastISel dropping srcloc metadata from InlineAsm Fixes #40555

@fintelia

…uppe,Mark-Simulacrum [RISC-V] Do not force frame pointers We have been seeing some very inefficient code that went away when using `-Cforce-frame-pointers=no`. For instance `core::ptr::drop_in_place` at `-Oz` was compiled into a function which consisted entirely of saving registers to the stack, then using the frame pointer to restore the same registers (without any instructions between the prolog and epilog). The RISC-V LLVM backend supports frame pointer elimination, so it makes sense to allow this to happen when using Rust. It's not clear to me that frame pointers have ever been required in the general case. In #61675 it was pointed out that this made reassembling stack traces easier, which is true, but there is a code generation option for forcing frame pointers, and I feel the default should not be to require frame pointers, given it demonstrably makes code size worse (around 10% in some embedded applications). The kinds of targets mentioned in #61675 are popular, but should not dictate that code generation should be worse for all RISC-V targets, especially as there is a way to use CFI information to reconstruct the stack when the frame pointer is eliminated. It is also a misconception that `fp` is always used for the frame pointer. `fp` is an ABI name for `x8` (aka `s0`), and if no frame pointer is required, `x8` may be used for other callee-saved values. --- I am partly posting this to get feedback from @fintelia who introduced the change to require frame pointers, and @hanna-kruppe who had issues with the original PR. I would understand if we wanted to remove this setting on only a subset of RISC-V targets, but my preference would be to remove this setting everywhere. There are more details on the code size savings seen in Tock here: tock/tock#1660

@oli-obk

Fix link error with #[thread_local] introduced by #71192 r? @oli-obk

linker: Never pass `-no-pie` to non-gnu linkers Fixes #73370

This is a combination of 18 commits. Commit #2: Additional examples and some small improvements. Commit #3: fixed mir-opt non-mir extensions and spanview title elements Corrected a fairly recent assumption in runtest.rs that all MIR dump files end in .mir. (It was appending .mir to the graphviz .dot and spanview .html file names when generating blessed output files. That also left outdated files in the baseline alongside the files with the incorrect names, which I've now removed.) Updated spanview HTML title elements to match their content, replacing a hardcoded and incorrect name that was left in accidentally when originally submitted. Commit #4: added more test examples also improved Makefiles with support for non-zero exit status and to force validation of tests unless a specific test overrides it with a specific comment. Commit #5: Fixed rare issues after testing on real-world crate Commit #6: Addressed PR feedback, and removed temporary -Zexperimental-coverage -Zinstrument-coverage once again supports the latest capabilities of LLVM instrprof coverage instrumentation. Also fixed a bug in spanview. Commit #7: Fix closure handling, add tests for closures and inner items And cleaned up other tests for consistency, and to make it more clear where spans start/end by breaking up lines. Commit #8: renamed "typical" test results "expected" Now that the `llvm-cov show` tests are improved to normally expect matching actuals, and to allow individual tests to override that expectation. Commit #9: test coverage of inline generic struct function Commit #10: Addressed review feedback * Removed unnecessary Unreachable filter. * Replaced a match wildcard with remining variants. * Added more comments to help clarify the role of successors() in the CFG traversal Commit #11: refactoring based on feedback * refactored `fn coverage_spans()`. * changed the way I expand an empty coverage span to improve performance * fixed a typo that I had accidently left in, in visit.rs Commit #12: Optimized use of SourceMap and SourceFile Commit #13: Fixed a regression, and synched with upstream Some generated test file names changed due to some new change upstream. Commit #14: Stripping out crate disambiguators from demangled names These can vary depending on the test platform. Commit #15: Ignore llvm-cov show diff on test with generics, expand IO error message Tests with generics produce llvm-cov show results with demangled names that can include an unstable "crate disambiguator" (hex value). The value changes when run in the Rust CI Windows environment. I added a sed filter to strip them out (in a prior commit), but sed also appears to fail in the same environment. Until I can figure out a workaround, I'm just going to ignore this specific test result. I added a FIXME to follow up later, but it's not that critical. I also saw an error with Windows GNU, but the IO error did not specify a path for the directory or file that triggered the error. I updated the error messages to provide more info for next, time but also noticed some other tests with similar steps did not fail. Looks spurious. Commit #16: Modify rust-demangler to strip disambiguators by default Commit #17: Remove std::process::exit from coverage tests Due to Issue rust-lang#77553, programs that call std::process::exit() do not generate coverage results on Windows MSVC. Commit #18: fix: test file paths exceeding Windows max path len

Don't run `resolve_vars_if_possible` in `normalize_erasing_regions` Neither `@eddyb` nor I could figure out what this was for. I changed it to `assert_eq!(normalized_value, infcx.resolve_vars_if_possible(&normalized_value));` and it passed the UI test suite. <details><summary> Outdated, I figured out the issue - `needs_infer()` needs to come _after_ erasing the lifetimes </summary> Strangely, if I change it to `assert!(!normalized_value.needs_infer())` it panics almost immediately: ``` query stack during panic: #0 [normalize_generic_arg_after_erasing_regions] normalizing `<str::IsWhitespace as str::pattern::Pattern>::Searcher` #1 [needs_drop_raw] computing whether `str::iter::Split<str::IsWhitespace>` needs drop #2 [mir_built] building MIR for `str::<impl str>::split_whitespace` #3 [unsafety_check_result] unsafety-checking `str::<impl str>::split_whitespace` #4 [mir_const] processing MIR for `str::<impl str>::split_whitespace` #5 [mir_promoted] processing `str::<impl str>::split_whitespace` #6 [mir_borrowck] borrow-checking `str::<impl str>::split_whitespace` #7 [analysis] running analysis passes on this crate end of query stack ``` I'm not entirely sure what's going on - maybe the two disagree? </details> For context, this came up while reviewing rust-lang#77467 (cc `@lcnr).` Possibly this needs a crater run? r? `@nikomatsakis` cc `@matthewjasper`

HWAddressSanitizer support # Motivation Compared to regular ASan, HWASan has a [smaller overhead](https://source.android.com/devices/tech/debug/hwasan). The difference in practice is that HWASan'ed code is more usable, e.g. Android device compiled with HWASan can be used as a daily driver. # Example ``` fn main() { let xs = vec![0, 1, 2, 3]; let _y = unsafe { *xs.as_ptr().offset(4) }; } ``` ``` ==223==ERROR: HWAddressSanitizer: tag-mismatch on address 0xefdeffff0050 at pc 0xaaaad00b3468 READ of size 4 at 0xefdeffff0050 tags: e5/00 (ptr/mem) in thread T0 #0 0xaaaad00b3464 (/root/main+0x53464) #1 0xaaaad00b39b4 (/root/main+0x539b4) #2 0xaaaad00b3dd0 (/root/main+0x53dd0) #3 0xaaaad00b61dc (/root/main+0x561dc) #4 0xaaaad00c0574 (/root/main+0x60574) #5 0xaaaad00b6290 (/root/main+0x56290) #6 0xaaaad00b6170 (/root/main+0x56170) #7 0xaaaad00b3578 (/root/main+0x53578) #8 0xffff81345e70 (/lib64/libc.so.6+0x20e70) #9 0xaaaad0096310 (/root/main+0x36310) [0xefdeffff0040,0xefdeffff0060) is a small allocated heap chunk; size: 32 offset: 16 0xefdeffff0050 is located 0 bytes to the right of 16-byte region [0xefdeffff0040,0xefdeffff0050) allocated here: #0 0xaaaad009bcdc (/root/main+0x3bcdc) #1 0xaaaad00b1eb0 (/root/main+0x51eb0) #2 0xaaaad00b20d4 (/root/main+0x520d4) #3 0xaaaad00b2800 (/root/main+0x52800) #4 0xaaaad00b1cf4 (/root/main+0x51cf4) #5 0xaaaad00b33d4 (/root/main+0x533d4) #6 0xaaaad00b39b4 (/root/main+0x539b4) #7 0xaaaad00b61dc (/root/main+0x561dc) #8 0xaaaad00b3578 (/root/main+0x53578) #9 0xaaaad0096310 (/root/main+0x36310) Thread: T0 0xeffe00002000 stack: [0xffffc0590000,0xffffc0d90000) sz: 8388608 tls: [0xffff81521020,0xffff815217d0) Memory tags around the buggy address (one tag corresponds to 16 bytes): 0xfefcefffef80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffef90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffefa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffefb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffefc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffefd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffefe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefcefffeff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>0xfefceffff000: a2 a2 05 00 e5 [00] 00 00 00 00 00 00 00 00 00 00 0xfefceffff010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xfefceffff080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Tags for short granules around the buggy address (one tag corresponds to 16 bytes): 0xfefcefffeff0: .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. =>0xfefceffff000: .. .. c5 .. .. [..] .. .. .. .. .. .. .. .. .. .. 0xfefceffff010: .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. See https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#short-granules for a description of short granule tags Registers where the failure occurred (pc 0xaaaad00b3468): x0 e500efdeffff0050 x1 0000000000000004 x2 0000ffffc0d8f5a0 x3 0200efff00000000 x4 0000ffffc0d8f4c0 x5 000000000000004f x6 00000ffffc0d8f36 x7 0000efff00000000 x8 e500efdeffff0050 x9 0200efff00000000 x10 0000000000000000 x11 0200efff00000000 x12 0200effe000006b0 x13 0200effe000006b0 x14 0000000000000008 x15 00000000c00000cf x16 0000aaaad00a0afc x17 0000000000000003 x18 0000000000000001 x19 0000ffffc0d8f718 x20 ba00ffffc0d8f7a0 x21 0000aaaad00962e0 x22 0000000000000000 x23 0000000000000000 x24 0000000000000000 x25 0000000000000000 x26 0000000000000000 x27 0000000000000000 x28 0000000000000000 x29 0000ffffc0d8f650 x30 0000aaaad00b3468 ``` # Comments/Caveats * HWASan is only supported on arm64. * I'm not sure if I should add a feature gate or piggyback on the existing one for sanitizers. * HWASan requires `-C target-feature=+tagged-globals`. That flag should probably be set transparently to the user. Not sure how to go about that. # TODO * Need more tests. * Update documentation. * Fix symbolization. * Integrate with CI

lenary and others added 30 commits May 30, 2020 18:24

Fix link error with #[thread_local] introduced by #71192

e9b67d2

store ObligationCause on the heap

af7fbec

use enum to represent ObligationCause::dummy without allocating

ea668d9

Add -O compile flag to test

2af53e9

Don't run test on emscripten which doesn't have threads

01e29c7

Update LLVM submodule

241d16f

Don't run generator transform when there's a TyErr

4004bf1

Avoid forming references to an uninitialized memory in DroplessArena

1f08951

Return a pointer from `alloc_raw` instead of a slice. There is no practical use for slice as a return type and changing it to a pointer avoids forming references to an uninitialized memory.

Join mutiple lines if it is more readable

64a6de2

Re-order correctly the sections in the sidebar

b67bdb5

Use track caller for bug! macro

fe7456c

Disable clippy tests

399bf38

Add more info to x.py build --help on default value for -j JOBS.

b34a417

Fix typo in docs of std::mem

71c54db

linker: Never pass -no-pie to non-gnu linkers

e8cf572

Use Ipv4Addr::from<[u8; 4]> when possible

0e6c333

Fix forge-platform-support URL

60410ef

Apparently it got changed.

Rollup merge of #73339 - jonas-schievink:unbug, r=estebank

5bbcdf5

Don't run generator transform when there's a TyErr Not sure if this might cause any problems later on, but we shouldn't be hitting codegen or const eval for the produced MIR anyways, so it should be fine. cc #72685 (comment)

Rollup merge of #73373 - lzutao:bug-trackcaller, r=Amanieu

94105c2

Use track caller for bug! macro

Rollup merge of #73380 - pnkfelix:make-bootstrap-help-print-num-cpus,…

66a1da3

… r=Mark-Simulacrum Add more info to `x.py build --help` on default value for `-j JOBS`.

Rollup merge of #73381 - ratijas:fix-typo-std-mem, r=jonas-schievink

759547b

Fix typo in docs of std::mem

Rollup merge of #73389 - lzutao:from, r=kennytm

3c437e5

Use `Ipv4Addr::from<[u8; 4]>` when possible Resolve this comment: #73331 (comment)

Rollup merge of #73400 - rnestler:patch-1, r=jonas-schievink

b4dd6a0

Fix forge-platform-support URL Apparently it got changed.

bors added 5 commits June 16, 2020 14:58

Auto merge of #73322 - Amanieu:asm-srcloc-llvm, r=cuviper

e8ff4bc

Update LLVM submodule Only includes one commit: - [D80759](https://reviews.llvm.org/D80759): Fix FastISel dropping srcloc metadata from InlineAsm Fixes #40555

Auto merge of #73065 - Amanieu:tls-fix, r=oli-obk

7d16c1d

Fix link error with #[thread_local] introduced by #71192 r? @oli-obk

Auto merge of #73384 - petrochenkov:gnulink, r=cuviper

e55d3f9

linker: Never pass `-no-pie` to non-gnu linkers Fixes #73370

richkadel merged commit 7ef9eb3 into richkadel:master Jun 18, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

update from origin 2020-06-18 #4

update from origin 2020-06-18 #4

richkadel commented Jun 18, 2020

update from origin 2020-06-18 #4

update from origin 2020-06-18 #4

Conversation

richkadel commented Jun 18, 2020