Spartan-S63 6 years ago

I don’t know the real narrative driving so much investment in safe C++, but it does appear that Rust is really driving C++ in this respect.

  • steveklabnik 6 years ago

    This proposal and Rust's guarantees are different. For example:

    > We do not attempt to address all aliasing cases or make concurrency safety guarantees. Programmers are still responsible for eliminating race conditions.

    Some stuff is the same:

    > The analysis is local each to function (no whole program analysis),

    Some stuff is in the middle:

    > We do not currently attempt to check of the internals of Owner types, including that we do not attempt to validate the correctness of pointer-based data structures.

    (Often, pointer-based data structures in Rust are implemented with unsafe, which isn't checked. Rust's lifetimes also don't interact with owners, in a sense. But Rust's ownership system does, and this paper treats the two together as one thing. But all of this is possibly splitting hairs.)

    This is based on a CFG; Rust's historically has been based on lexical scope, but is moving to a CFG soon (NLL is slated for 1.31.)

    Regardless, I'm glad to see anything that makes C++ safer, regardless of motivations. It's all about making software better, not cheering on the home team and booing the away team.

    • comex 6 years ago

      Aliasing is a big one. One of the main reasons Rust’s borrow checker can be painful is that it doesn’t want you to have two mutable references to the same thing, anywhere in your code. In C++, on the other hand, mutable references are the norm, and it’s common to have a multitude of different references to the same object, handed out to any other object that wants to know about it. (And traditionally they wouldn’t even be ref-counted. “Modern C++” encourages the use of shared_ptr, which at least prevents the references themselves from going stale, but that doesn’t help if you have a temporary pointer into a sub-object derived from one reference, e.g. an element of a vector, and then someone invalidates the sub-object through another reference.)
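
      A minimal sketch of that last case, assuming nothing beyond the standard library (`SharedVec` and `grow_through_alias` are made-up names for illustration): both shared_ptr handles stay valid, yet growing the vector through one of them reallocates the storage that an element pointer obtained through the other would have pointed into.

      ```cpp
      #include <cstddef>
      #include <memory>
      #include <vector>

      // Two shared owners of the same vector: neither handle can go stale.
      struct SharedVec {
          std::shared_ptr<std::vector<int>> a = std::make_shared<std::vector<int>>(4, 7);
          std::shared_ptr<std::vector<int>> b = a;  // second reference to the same object
      };

      // Grows the vector through `b` and reports whether its capacity changed,
      // i.e. whether the element storage was reallocated. Any int* or int& taken
      // through `a` before this call would be dangling afterwards.
      bool grow_through_alias(SharedVec& s) {
          const std::size_t old_capacity = s.a->capacity();
          s.b->resize(old_capacity + 1);  // exceeds the capacity, so storage moves
          return s.a->capacity() != old_capacity;
      }
      ```

      The sketch deliberately never dereferences a stale pointer; it only shows that the reference count does nothing to stop the reallocation.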

      Well, this lifetime checker, to quote the paper, “aims for low false positives in ‘good’ code that follows modern C++ convention”. And for better or worse, that “modern C++ convention” doesn’t include a drastic rework towards Rust-style pervasive immutability and explicit locking (`RefCell`) if you want to mutate something shared. Nor does it do anything magical to work around the reasons Rust needs that. Indeed, there’s not much that can be done as long as you want to stick with a purely function-local analysis (which, for the record, Rust’s borrow checker is as well) – at least, not without using an even more elaborate system of manual annotations.

      So instead it gives up on dealing with those cases. Which is perfectly reasonable – it’s still going to do a lot of good by catching many common types of mistakes. On the other hand, it’s important to recognize that its limitations aren’t a matter of a few missing pieces or legacy constraints, things that can be easily patched up in the future. As it stands, the design fundamentally sacrifices the ability to make the kind of complete guarantees Rust’s borrow checker can, and that can’t change unless it’s willing to cause significantly more disruption for codebases and programmers.

      • slededit 6 years ago

        I would say modern C++ encourages unique_ptr, plus naked pointers for ephemeral references, much more than shared_ptr. C++11 introduced unique_ptr, after all, while shared_ptr has been around longer.

        Some projects have gone as far as to say shared_ptr is a code smell, since you never declare who the actual owner is. The memory may not be freed, but that doesn’t mean it isn’t stale.

    • duneroadrunner 6 years ago

      > This proposal and Rust's guarantees are different.

      A little different, but only in the sense of being, for the moment, less ambitious (than Rust) about the completeness of the checker's implementation. But I think it substantially demonstrates a straightforward path to achieving the same level of memory safety in C++ as Rust.

      > For example: > > We do not attempt to address all aliasing cases

      It seems to me that a main principle behind the design of the Rust language was to eliminate the aliasing issue, by imposing the "mutable references are exclusive" restriction, as a prerequisite to addressing the memory safety issue. And there was an implication (and sometimes more than that) that as long as C++ can't address the aliasing issue it can't address the memory safety issue (like Rust can).

      I think the key thing that the lifetime checker demonstrates is that this premise is not (quite) right. You don't need to completely eliminate the aliasing issue to achieve (efficient) memory safety, you just need to address it "enough". And that can be done for C++.

      Now, whether the complete elimination of the aliasing issue is, apart from the memory safety implications, a virtue that's worth the (flexibility) cost of the "mutable references are exclusive" restriction, is as far as I'm aware, still a matter of opinion. (Given the existence of the RefCell wrapper in Rust, and the ease (if not elegance) of implementing an "anti-RefCell" wrapper in C++, I'm guessing it's mostly a wash.)

      > > or make concurrency safety guarantees. Programmers are still responsible for eliminating race conditions.

      Data races are a separate issue, but relative to the (single-threaded) memory safety issue, I think there'd be less controversy that it can be addressed in C++ in vaguely similar fashion to Rust. [1]

      But both in the case of Rust and the C++ lifetime checker, I don't think that the cost of the imposed (compile-time) restrictions in terms of code/algorithmic flexibility is being adequately acknowledged. For example, say you have a list (or whatever container) of references to (pre-)existing objects. In both cases, the restrictions require that all the objects must (be known to) outlive the list container even if there is a reference to them in the list for only a short time [2]. Imo this is an impractical restriction. In C++, you could use run-time checked pointers[3][4] to alleviate the restriction without sacrificing memory safety [5]. It's not immediately obvious to me that Rust couldn't have such a run-time checked reference as well. But others would be more qualified to make that assessment.
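
      To make the idea of a run-time checked reference concrete, here is a rough sketch that uses std::weak_ptr as a stand-in. This is an assumption for illustration only: it is not the registered or norad pointer from the links below, and `Watchlist` and `demo_watchlist` are hypothetical names. The list outlives one of its targets, and the check happens at the point of access.

      ```cpp
      #include <memory>
      #include <vector>

      // A container of run-time checked references (std::weak_ptr as a stand-in).
      struct Watchlist {
          std::vector<std::weak_ptr<int>> items;

          // Sums only the targets that are still alive; expired entries are
          // skipped instead of being dereferenced.
          int sum_alive() const {
              int total = 0;
              for (const auto& w : items) {
                  if (auto p = w.lock()) total += *p;
              }
              return total;
          }
      };

      // The list outlives one of the objects it refers to.
      int demo_watchlist() {
          Watchlist list;
          auto long_lived = std::make_shared<int>(10);
          list.items.push_back(long_lived);
          {
              auto short_lived = std::make_shared<int>(5);
              list.items.push_back(short_lived);
          }  // short_lived is destroyed here while still listed
          return list.sum_alive();
      }
      ```

      weak_ptr pays for this with heap allocation and reference counts, so it is only meant to show the shape of the trade: a run-time check in exchange for a relaxed compile-time lifetime requirement.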

      [1] shameless plug: https://github.com/duneroadrunner/SaferCPlusPlus#multithread...

      [2] https://github.com/duneroadrunner/misc/blob/master/201/8/Jul...

      [3] https://github.com/duneroadrunner/SaferCPlusPlus#registered-...

      [4] https://github.com/duneroadrunner/SaferCPlusPlus#norad-point...

      [5] https://github.com/duneroadrunner/misc/blob/master/201/8/Jul...

      • roca 6 years ago

        This approach does not "demonstrate a straightforward path to achieving the same level of memory safety in C++ as Rust". One reason is that the approach is tightly constrained to analyze Owners which own a single type of objects, all of which are treated uniformly. So, for example, a class which owns two different kinds of data simply can't be analyzed usefully by this approach. More details here: https://robert.ocallahan.org/2018/09/more-realistic-goals-fo...

        The new lifetimes proposal has explicitly revised its goal from "safe C++ subset" to "catch some common errors". That's good, because it's far more realistic and still very useful, but it's going to be problematic if people think that these guidelines imply the existence of a statically checkable, useful safe C++ subset.

      • steveklabnik 6 years ago

        > I think it substantially demonstrates a straightforward path to achieving the same level of memory safety in C++ as Rust.

        What path is that? It's not clear to me how this is possible without breaking backwards compatibility. This statement is directly at odds with

        > You don't need to completely eliminate the aliasing issue to achieve (efficient) memory safety, you just need to address it "enough"

        Safe Rust isn't "enough" safe, it is 100% (proofs pending, of course) safe. A "safe enough" system is not equivalent.

        Maybe if you mean that the existence of Unsafe Rust means that it's "enough", well fair. I still think it's substantially different; this proposal doesn't even get to being 100% safe.

        That said, as I said above, I very much welcome any sort of incremental improvement here.

        • duneroadrunner 6 years ago

          > It's not clear to me how this is possible without breaking backwards compatibility.

          What do you mean? Presumably most existing sizable codebases will not satisfy the requirements of the (eventual completed) lifetime checker, even if the code is actually safe.

          > > You don't need to completely eliminate the aliasing issue to achieve (efficient) memory safety, you just need to address it "enough"

          > Safe Rust isn't "enough" safe, it is 100% (proofs pending, of course) safe. A "safe enough" system is not equivalent.

          I'm asserting (perhaps mistakenly) that you don't need to address the aliasing issue as completely as (Safe) Rust does in order to achieve the same memory safety that (Safe) Rust does. Or maybe "completely" is not exactly the right word. I'm saying that the universal imposition of the "mutable references are exclusive" restriction is not a necessary prerequisite to (fully) achieving the same type of ("zero overhead") memory safety that (Safe) Rust does.

          I'm not that familiar with Rust, but for example, if you have a reference to an element in a dynamic container, like a vector, (Safe) Rust ensures that that element is not prematurely deallocated by ensuring that there is no simultaneously existing mutable reference to the container. I.e. the container is immutable while an (immutable) reference to one of its elements exists. Right? (I mean, assuming you didn't "split" it first.) And if the reference to the element is a mutable reference, then the container cannot be referenced at all, right?

          From a memory safety perspective, this is overkill. The container does not have to be immutable (or inaccessible) while a reference to an element exists, only its structure needs to be immutable. The C++ lifetime checker imposes this lesser restriction.

          And for simple objects that do not have dynamic structure (or any indirect references), then the "mutable references are exclusive" restriction provides no memory safety benefit at all, right?

          As I said, even this lesser restriction will presumably break most existing C++ codebases, and you could (I think legitimately) argue that that makes this new "Safe" C++ a substantially different language than traditional C++. But the fact that the restrictions are much less severe than Rust's means that a lot fewer code modifications will be required than if a Rust-style universal "mutable references are exclusive" restriction had been adopted.

          Am I making sense here? Maybe I'm mistaken and these "lesser" restrictions are somehow inadequate, but I don't see it.

          • steveklabnik 6 years ago

            > I'm asserting (perhaps mistakenly) that you don't need to address the aliasing issue as completely as (Safe) Rust does in order to achieve the same memory safety that (Safe) Rust does.

            You need to address the aliasing, or address the mutability. They're two sides of the same coin.

            > only its structure

            I am not 100% sure what distinction you're making here, sorry. What's "the container" vs "its structure"?

            > From a memory safety perspective, this is overkill.

            Yes, in general, Rust takes a soundness-based approach. If you can't prove that it's safe, then it's not safe. This takes the other path, which is totally valid, mind you! But that means it will allow cases that are not safe.

            > And for simple objects that do not have dynamic structure (or any indirect references), then the "mutable references are exclusive" restriction provides no memory safety benefit at all, right?

            That's not right. You can have a data race to a plain old integer.

            • duneroadrunner 6 years ago

              > That's not right. You can have a data race to a plain old integer.

              Sure, if your language allows unprotected access to any object from any thread. Which, I guess traditional C++ essentially does, but presumably a "Safe" C++ would eventually have an "asynchronous sharing" checker that would require any shared objects to be appropriately "protected".

              > > only its structure

              > I am not 100% sure what distinction you're making here, sorry. What's "the container" vs "its structure"?

              For example:

                  std::vector<int> vec1 {1, 2};
                  {
                      const auto& cref1 = vec1.at(0);
                      auto& ref2 = vec1.at(1);
                      ref2 = 3;
                      auto& ref1 = vec1.at(0);
                      std::cout << cref1;
                      ref1 = 4;
                      
                      // co-existing const and non-const references are permitted and memory safe here
                      
                      std::cout << vec1.size();
                              
                      vec1.at(0) = 5;
              
                      vec1.clear(); // <---- Rejected by the lifetime checker
                      
                      // because the clear() call mutates the structure.
                      
                      // Mutating the data contained in the vector is permitted though.
                      
                      std::cout << cref1;
                  }
              
              > Yes, in general, Rust takes a soundness-based approach. If you can't prove that it's safe, then it's not safe.

              The approach is not that different. The lifetime checker applies (or will apply) basically the same sorts of restrictions that the Rust compiler does (and "break" backward compatibility in the process), but only when necessary to enforce memory safety.

              I mean, the way the lifetime checker works is that it basically keeps track, at compile-time, of the latest possible death-time of every reference and the earliest possible death-time of the target object (or potential target objects) that each reference points at, and complains anytime the former is later than the latter.

              • Jweb_Guru 6 years ago

                I think you will be disappointed if you expect an approach that doesn't do roughly what Rust does aliasing-wise, and doesn't do something very conservative on >1 word sized updates, to be memory safe in the presence of concurrency. People have been working on that problem for a really long time and I frankly don't see any approach that is going to work in a C++ environment other than Rust's. For the single threaded case, sure, you can probably get close with something much more relaxed. But the Rust core team is not stupid, they didn't insist on such stringent aliasing rules just so you could use restrict.

          • comex 6 years ago

            You're on the right track. Rust has a type called Cell [1] (not to be confused with RefCell), which is just a wrapper around T with no overhead. Cell has what Rust calls "interior mutability", which means you can both read and write the contained value using a nominally immutable reference. This is safe within Rust's type system because it doesn't let you get a reference to the contained value (which could then become dangling), only move the whole value in and out. It's also not safe to use from multiple threads, since data races are UB in both Rust and C (to do that properly you'd need to use atomics); but Rust automatically prevents you from sharing references to Cells across threads (because Cell is marked as !Sync).

            Anyway, in practice Cell can be used to wrap individual fields of basic data types, e.g. Cell<u32>, and you effectively have an aliasable mutable field. For such types, the inability to get a reference to the interior doesn't really matter, since there's rarely a reason to take a reference rather than just copying the value. However, the syntax is pessimized: you have to do field.get() and field.set(val) instead of getting and setting the field normally. In theory, this could be built into the language (rather than being mostly a library feature) and it could be quite a bit nicer; I remember seeing that proposed at least once. The reasons it hasn't happened so far are, I think, at least partly philosophical.

            (edit: This can also be used for your vector example, as you can have a Vec<Cell<u32>>.)
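
            For readers coming from the C++ side, a rough analogue of that API shape might look like the following. It is a hypothetical sketch, not a port of Rust's Cell: C++ `const` gives none of Rust's guarantees, and nothing here stops the type from being shared across threads. It only illustrates "get and set by value, never hand out a reference to the interior".

            ```cpp
            #include <utility>
            #include <vector>

            // Hypothetical C++ analogue of the Cell idea: the value can be replaced
            // through a const reference, but no reference to the interior escapes.
            template <typename T>
            class Cell {
            public:
                explicit Cell(T v) : value_(std::move(v)) {}
                T get() const { return value_; }                // returns a copy
                void set(T v) const { value_ = std::move(v); }  // overwrites in place
            private:
                mutable T value_;
            };

            // Bumps every element through a const view of the vector, which cannot
            // change the vector's size or storage.
            inline void bump_all(const std::vector<Cell<unsigned>>& cells) {
                for (const auto& c : cells) c.set(c.get() + 1);
            }
            ```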

            [1] https://doc.rust-lang.org/nightly/std/cell/struct.Cell.html

  • 72deluxe 6 years ago

    I am not sure about Rust causing this. I think the safety has long been a goal of any C++ developer, particularly with large projects.

    I mean, I've never touched or looked at Rust, yet safe C++ is very very high on my list of "important things", and the same goes for my C++ developer colleagues, none of whom have looked at Rust either.

    • MaulingMonkey 6 years ago

      > I am not sure about Rust causing this. I think the safety has long been a goal of any C++ developer, particularly with large projects.

      I'd guess that the yearning for safety in C++ may have been what's been driving Rust development, if anything.

      Rust's whole appeal to me is it takes the scattered, compiler-specific, opt-in, limited annotation-driven static analysis that I've always been retrofitting into existing C++ codebases and catching problems with... and replaced those checks with a unified, language-standard, opt-out, powerful and pervasive thing, that'll be in use from the get-go in Rust codebases.

    • childintime 6 years ago

      The alternative is to write new stuff in Rust and link as usual, and get the many other benefits besides lifetimes. Especially if, as you say, "safe C++ is very very high on my list of 'important things'". The safety Rust offers makes you a better programmer, because in C++ your focus is partially (mostly?) tied up in providing guarantees that Rust provides for free. I presume you can't imagine the breath of fresh air that is.

      The only price to pay is give up objects, and that may be a good thing.

      • jjnoakes 6 years ago

        > The only price to pay is give up objects

        There are other prices to pay, of course.

        Rust doesn't target as many systems as C++, for example.

        Adding Rust to a project means someone has to update your build/CI/CD/etc. infrastructure, your development machines need Rust installed and kept up to date, and your developers have to become familiar with Rust.

        There will also probably be pain points when crossing the C++ <-> Rust boundary (since you have to do so at mostly the C level) - you lose lifetime information, you have to convert data types back and forth, etc.

        These may or may not be show-stoppers, but there's certainly more than near-zero friction in running and maintaining a C++ code base with incremental pieces rewritten in Rust.

        (I'm a huge Rust fan, btw).

        • blub 6 years ago

          What's a nice way to call Rust from C++? If the answer is through a C interface then I am (still) disappointed.

          • bluejekyll 6 years ago

            You can make decent wrappers in both languages that provide a similar feeling so that you have a good C++ or Rust API over C.

            But yes, out of the box, there aren’t simple ways of exposing native interfaces across the language boundaries.
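
            On the C++ side, such a wrapper might look roughly like this. The `counter_*` functions are a made-up C-level interface of the kind a Rust library could export; they are defined in C++ here only so the sketch is self-contained.

            ```cpp
            #include <memory>

            // Hypothetical C-level interface, declared the way a C header would.
            extern "C" {
            struct counter;  // opaque to callers
            counter* counter_new(int start);
            int counter_next(counter* c);
            void counter_free(counter* c);
            }

            // Stand-in implementation; in practice this would live in the other language.
            struct counter { int value; };
            extern "C" counter* counter_new(int start) { return new counter{start}; }
            extern "C" int counter_next(counter* c) { return c->value++; }
            extern "C" void counter_free(counter* c) { delete c; }

            // C++ wrapper: ownership is tied to the object's lifetime, so callers
            // never call counter_free themselves.
            class Counter {
            public:
                explicit Counter(int start) : handle_(counter_new(start), &counter_free) {}
                int next() { return counter_next(handle_.get()); }
            private:
                std::unique_ptr<counter, void (*)(counter*)> handle_;
            };
            ```

            The wrapper restores RAII on the C++ side, but everything that crosses the boundary is still an opaque pointer plus plain functions.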

            • jerf 6 years ago

              It's worth pointing out there really isn't any such thing anyhow.

              We spent many decades with all our "cool, exotic" languages like Python and Perl and PHP and such just giving up and tying themselves to C semantics, so the C ABI could serve as a de-facto interchange format.

              But in what I think is a sign that the C dominance is finally starting to fade, more languages are getting away from that, and while they still need a C bridge, it's getting more and more isolated from the rest of the languages.

              The core problem is that the cross-language interface isn't just about "calling a function"; there's semantics around who owns what, who's responsible for deallocation, garbage collection, degrees of thread safety and memory boundaries, and on and on and on it goes. It is a bit of an illusion from the last several decades that there is any way for there to be a "generic cross-language interface", because under the hood everything was designed to either be "C memory semantics" or "C memory semantics plus whatever other stuff we can stick on that doesn't conflict with that".

              By contrast (and as the major exceptions to "everything is just C"), observe how "easy" it is for JVM-based or .Net languages to share objects, because the runtime environment makes almost all the hard decisions for you.

      • logicchains 6 years ago

        >The safety Rust offers makes you a better programmer, because in C++ your focus is partially (mostly?) tied up to provide guarantees Rust provides for free.

        Unfortunately there are still some very nice safety benefits that C++ provides but Rust doesn't: those relying on integer template parameters. C++ for instance allows checking at compile time that matrices with sizes known at compile time can be multiplied together, while in Rust this would be a runtime error. Even if Rust got integer template parameters, it's still a long way away from supporting variadic template parameters, which allow creating tensors of arbitrary dimension and for checking operations on them at compile time.
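
        A small sketch of the matrix case, using a made-up `Matrix` type rather than any particular library: the dimensions are template parameters, and the multiplication operator only matches when the inner dimensions agree.

        ```cpp
        #include <array>
        #include <cstddef>

        // Matrix whose dimensions are part of its type.
        template <std::size_t Rows, std::size_t Cols>
        struct Matrix {
            std::array<std::array<double, Cols>, Rows> data{};  // zero-initialized
        };

        // Only matches when the inner dimensions agree: (R x K) * (K x C).
        template <std::size_t R, std::size_t K, std::size_t C>
        Matrix<R, C> operator*(const Matrix<R, K>& a, const Matrix<K, C>& b) {
            Matrix<R, C> out;
            for (std::size_t i = 0; i < R; ++i)
                for (std::size_t j = 0; j < C; ++j)
                    for (std::size_t k = 0; k < K; ++k)
                        out.data[i][j] += a.data[i][k] * b.data[k][j];
            return out;
        }
        ```

        With these definitions, multiplying a `Matrix<2, 3>` by a `Matrix<3, 1>` compiles, while swapping the operands fails template deduction and is rejected at compile time.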

        This is not just a safety issue but also a performance issue, as libraries like xtensor (https://github.com/QuantStack/xtensor) allow stack-allocating such arrays, but (correct me if I'm wrong, this may be outdated) there's no way to implement a stack-allocated tensor of arbitrary dimensionality in Rust. Expression templates, an important technique for improving linear algebra performance by performing optimisation of computation graphs at compile time (kind of like Haskell rewrite rules, but implemented entirely as libraries), are also harder to use in Rust.

        • steveklabnik 6 years ago

          We've accepted an RFC for this, and hopefully will have an implementation on nightly by the end of the year, with stabilization sometime next year, incidentally.

          • logicchains 6 years ago

            https://www.reddit.com/r/rust/comments/6nexpo/variadic_gener... suggests that we however won't be seeing variadic generics for quite some time, so it still won't be possible to build a library like xtensor supporting tensors of arbitrarily many dimensions.

            • steveklabnik 6 years ago

              Almost all of the stuff mentioned there as needing to happen before it has now happened.

              There isn’t an accepted RFC yet, so it will be some time still, but it is a desired feature.

          • pjmlp 6 years ago

            So can we hope to see something like Eigen in Rust?

      • pjmlp 6 years ago

        That works for CLI applications on desktop platforms, however there are many domains where C++ is used where the Rust tooling is not even available.

        Embedded, GPGPU, certified compilers, HPC, Fintech, OS drivers, hardware design,...

        So any improvement that brings more safety to C++ is welcome.

        • bluejekyll 6 years ago

          As an aside, Rust is making significant strides in many of those areas. It hasn’t had the longevity of C++ which already has all of that, but it is getting better.

          Things take time and energy.

          • pjmlp 6 years ago

            Agreed, that is why I like Rust and still argue for C++ at the same time.

            It took decades for C++ to reach where it is, so any improvement to handle C++ caveats for current codebases is great.

            In the meantime Rust, or any other safer systems languages, can keep improving and slowly start replacing C++ on new projects.

            Same applies to some kind of Safe C, Checked C or whatever finally manages to win the hearts of UNIX kernel devs.

          • Koshkin 6 years ago

            ... while C++ already has taken space and enjoys momentum.

      • blub 6 years ago

        You got it the wrong way around: C++ makes you a better programmer, in the same way that juggling chainsaws makes you a better juggler than merely juggling oranges.

        • bluejekyll 6 years ago

          That’s an extremely interesting analogy. But I think you have something wrong in it. You don’t become a better juggler by juggling chainsaws, you must be a very good juggler before you juggle with chainsaws. Otherwise you cut your hands off.

          C++ I guess is as close to juggling chainsaws as you can get in software, maybe C is even closer, but it means that you’re constantly doing the equivalent of cutting off your hands. Luckily that’s only allegorically.

          • pjmlp 6 years ago

            C++ is chainsaws with grips, while C is double edged chainsaws. :)

            Ideally we would get chainsaws with protective cover, but it takes years to get there.

            • estebank 6 years ago

              I believe this analogy works all too well: C is a chainsaw that has a smaller chainsaw coming out of the opposite side on a pseudo-deterministic basis (UB). You're "fine" as long as you know how to avoid it.

  • imglorp 6 years ago

    Not everyone gets to write new things :) There's a ton of existing codebases where there's no appetite for a rewrite from scratch, but some incremental improvements can be tolerated.

  • rwj 6 years ago

    This is part of the ongoing evolution. The introduction of move operations and smart pointers in C++11 really started a movement towards "safe by design" (Herb Sutter's words).

  • pjmlp 6 years ago

    Rust ideas are indeed coming into the C++ community, but safety was always part of the culture.

    A big part of C++'s culture was that we don't need to write unsafe C code unless we really must. That is why Bjarne designed it in the first place, after his BCPL experience.

    The fact that many C refugees use C++ as pretty C, is a sad side effect of the copy-paste compatibility.

    • jjnoakes 6 years ago

      > A big part of C++'s culture was that we don't need to write unsafe C code unless we really must

      In C++ you don't have to drop down to "unsafe C code" to lose lots of safety guarantees (like data races, iterator invalidation, dangling pointers, use-after-free, use-after-move, etc)...

      • pjmlp 6 years ago

        True, but you have missed the part that makes C++ safer than C.

        - User defined types with invariants

        - templates, const, constexpr and if constexpr instead of macros

        - proper string types

        - proper vector types

        - type safe enumerations

        - many implicit conversions in C are compiler errors in C++

        - RAII

        - contracts

        - generic data structures

        The iterator invalidation is easy to find out in compilers like Visual C++, their STL implementation can be customized to validate iterator invariants.

        The dangling pointers, use-after-free, and use-after-move you mentioned are exactly what this proposal targets.

        I was an early C++ adopter in 1992, LLVM rejuvenated language research in 2003, DirectX and Metal shaders are a C++ dialect, AUTOSAR has migrated from C to C++14 only two years ago, NVidia GPUs since Volta (2017) are now being designed to run C++ code efficiently.

        That is a time frame of about 30 years for C++ to become as widespread as it is.

        I am all for Rust or similar language to replace C++ and do away with C copy-paste culture, but I am not throwing away all my code just because.

        So yeah, it isn't perfect, but it surely is way better than plain C coding and the community does try to promote not doing C in C++.

        CppCon 2015: Kate Gregory “Stop Teaching C"

        https://www.youtube.com/watch?v=YnWhqhNdYyk

        • jjnoakes 6 years ago

          > True, but you have missed the part that makes C++ safer than C.

          Of course C++ is safer than C.

          I was just responding to what I quoted, which unfairly lumps "unsafe" and "C" in the same turn of phrase, implying that avoiding C-like code avoids unsafety as well.

    • masklinn 6 years ago

      > safety was always part of the culture.

      That's a joke right? https://en.cppreference.com/w/cpp/utility/optional/operator* screams the exact opposite of "safety is part of the culture". And last I checked std::expected currently under specification does the exact same thing.
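
      For anyone who hasn't followed the link, the difference is roughly the following (C++17; the helper names are made up for illustration). `value()` is the checked accessor, while `operator*` on an empty optional is undefined behavior, so the sketch only dereferences after an explicit test.

      ```cpp
      #include <optional>

      // Checked access: value() throws std::bad_optional_access when empty.
      // Returns true if the access was rejected.
      inline bool checked_access_throws(const std::optional<int>& o) {
          try {
              (void)o.value();
              return false;
          } catch (const std::bad_optional_access&) {
              return true;
          }
      }

      // Unchecked access is only valid after an explicit test; *o on an empty
      // optional is undefined behavior, so this helper guards it.
      inline int deref_or(const std::optional<int>& o, int fallback) {
          return o ? *o : fallback;
      }
      ```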

      > A big part of C++'s culture was that we don't need to write unsafe C code unless we really must.

      Because you've got plenty of unsafe C++ you can write instead?

      • pjmlp 6 years ago

        It is no joke, the language offers the tools for security conscious developers not to write plain C, while being able to easily write safe wrappers to C libraries.

        And since no one is going to rewrite all UNIX derived OSes and their C userland into Rust, that kind of compatibility is desired.

        Anyone who spends a couple of minutes going through C++ conference talks knows that the community does care about how to put those language features to good use.

        When was the last C conference with security related talks?

        • bluejekyll 6 years ago

          Tools like bindgen have made exposing C headers to Rust pretty simple, so it's easy to integrate C into Rust projects.

          I don’t really think it’s that much different than the C/C++ macro style used for crossing that bridge between the two languages at the end of the day.

          • pjmlp 6 years ago

            It is, there is no bridge as C is a C++ subset, so you can just #include and code away.

            Naturally that convenience is also the cause of our beloved CVEs.

            • bluejekyll 6 years ago

              While that’s true in principle, what I was getting at is that often there are CPP macros to make the C <-> C++ bindings more native between the two envs.
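
              The usual shape of that idiom, as a minimal sketch (`add_ints` is a hypothetical function): the header wraps its declarations in a guard so that a C++ compiler sees them with C linkage while a C compiler sees plain declarations.

              ```cpp
              // Header-style declarations that compile as either C or C++.
              #ifdef __cplusplus
              extern "C" {
              #endif

              int add_ints(int a, int b);

              #ifdef __cplusplus
              }  // extern "C"
              #endif

              // Definition; in a real project this would live in a .c file.
              int add_ints(int a, int b) { return a + b; }
              ```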

kvark 6 years ago

> It aims to detect common local cases of dangling pointers/iterators/string_views/spans/etc

Hmm, a good tool to have, but it wouldn't be quite enough to sleep well, since it's only for "common cases".

usefulcat 6 years ago

"I love C++. I also love safe code and not having to worry about dangling pointers and iterators and views. So I’ve been doing some work to make my life less conflicted"

I love the honesty there :)