nathan_f77 6 years ago

> Fear column addition no more

This section was really surprising to read! I use the strong_migrations [1] gem to catch "NOT NULL column with a default value", because it's such a common mistake.

I guess I had assumed that this was just something we had to accept, based on decades of architecture decisions, and the way that Postgres was written. It never occurred to me that it was actually possible to change the default behavior of Postgres and fix the underlying issue.

[1] https://github.com/ankane/strong_migrations#adding-a-column-...

  • booleanbetrayal 6 years ago

    This feature has me way more excited than I probably should be. Been splitting up relatively simple schema migrations into multiple steps for years now, and definitely welcoming this change.

    • mb4nck 5 years ago

      Notably, that feature got written by somebody from Salesforce (I assume on their work time, but not sure), and finished up by a Postgres major contributor. It might be the first major contribution from Salesforce outside of all the stuff Tom Lane did when he was working there.

      • petergeoghegan 5 years ago

        Not so. I worked on UPSERT for several years as a Salesforce employee, as well as quite a number of other things. I worked for Heroku, a Salesforce business unit, from 2013-2017.

jorams 6 years ago

> Now you can quit Postgres by simply typing quit or exit. Previously you had to use Ctrl + D or \q

While this is good to help out beginners a bit, people should really learn Ctrl + D. It works practically everywhere and saves tons of time guessing/remembering and even typing the correct incantation for the program you want to exit.

  • craigkerstiens 6 years ago

    Original author here.

    Yes, it's great to learn Ctrl + D.

    But losing beginners who aren't familiar with it, in favor of what insiders consider naturally intuitive, is a great way to make those who are new to programming, and especially databases, feel like they're dumb or not capable of it. I still recall the first time I was in vi and couldn't figure out how to get out of the damn thing, and don't get me started on arrow keys. There are a lot of ways to help others learn things like Ctrl + D; an even better experience might be a one-second delay with a message informing them of the other ways to exit.

    We could bikeshed all day about the most pristine way to exit a program; paint the shed whatever color you want, but being intuitive for beginners is never a bad thing.

    • davidw 6 years ago

      I'm reminded of this bit, from https://www.gnu.org/fun/jokes/ed-msg.html

      Let's look at a typical novice's session with the mighty ed:

          golem$ ed
      
          ?
          help
          ?
          ?
          ?
          quit
          ?
          exit
          ?
          bye
          ?
          hello?
          ?
          eat flaming death
          ?
          ^C
          ?
          ^C
          ? 
          ^D
          ?
      ---

      Note the consistent user interface and error reportage. Ed is generous enough to flag errors, yet prudent enough not to overwhelm the novice with verbosity.

      • tormeh 6 years ago

        I think I did something like this with vim when I first encountered it. Now I like the damn thing (but still prefer VS Code).

    • Jedd 6 years ago

      You use the word intuitive a couple of times, and suggest there's a natural way to exit interactive cli programs (and it's not ctrl-d).

      Yet for me, ctrl-d is my usual first way of trying to exit any shell or cli utility.

      Whether it's natural or not, I couldn't say. Arguably the nipple is the first and only intuitive interface; everything after that is learned. After many years on MySQL/MariaDB I still find the backslash shortcuts very counterintuitive compared to 'show databases | tables' etc.

      While I think it's great to have both options present, I think the demographic of people who work out they want an RDBMS, evaluate and select PostgreSQL, work out how to install the server and client components, su to the postgres user, and are then likely to abandon in frustration because they can't determine how to exit the CLI, is negligible.

      • craigkerstiens 6 years ago

        The background you learned on is key to what's intuitive or natural. If you came up on a Unix-based system then yes, you'd expect this. If you learned databases via MySQL then you'd expect show databases to work.

        Today though, many more people are coming to development without an academic or Linux background. If you install Postgres via Postgres.app there's a chance you're not at all familiar with a command line. What is intuitive is, yes, relative to your background. I very much hope the number of people for whom quit or exit is intuitive, those that don't come from a Linux or academic background, grows over time. This becomes a more valuable tool the more beginners there are, and from working with a lot of beginners already, I personally suspect it's more valuable than many realize.

        • Jedd 6 years ago

          > The background of what you learned on is key to whats intuitive or natural.

          That's pretty much the summary of my post.

          > If you install Postgres via Postgres.app ...

          I had to go look up what Postgres.app actually is -- evidently it's an easy way for Mac users to install Postgres.

          I don't have a Mac -- which is also true for most people -- so I can't speak to the average mac user's experience or expectations.

          As I have (had to repeatedly) say a few times now, I think it's great that multiple options to end the program are available.

          I'd imagine for Mac users the 'most intuitive' thing would be to find some magic symbol on their keyboard and combine it with the letter Q.

          • solarengineer 6 years ago

            Mac users either don't stop the service at all, or they use the GUI to do so (if they installed via the postgres.app tool)

        • tormeh 6 years ago

          Most CLIs can be exited with "quit" or "exit". Many trap Ctrl-C, so those two are the first things I try. I didn't know about Ctrl-D before now, and I've worked as a sysadmin.

      • freeopinion 6 years ago

        CTRL+D works. You like it. Great, use that. You don't think anybody needs an alternative? Or you don't think having an alternative will help at all? Fine, think what you want. One thing's for (mostly) certain: it doesn't hurt. So there's really nothing to complain about.

        • Jedd 6 years ago

          > CTRL+D works. You like it. Great, use that. You don't think anybody needs an alternative? Or you don't think having an alternative will help at all? Fine, think what you want. One thing's for (mostly) certain: it doesn't hurt. So there's really nothing to complain about.

          Not complaining about its presence, as per my comment :

          "While I think it's great to have both options present,..."

          Just observing that claims of 'more intuitive' interfaces are very subjective, and that it was unlikely to be a deal breaker for a new user.

          • sverhagen 6 years ago

            You don't think it's intuitive for a command line application that is entirely driven by commands, there to be a command to exit it? Instead of a key combination? I'm used to Ctrl+D now, but it sure took me a while to figure it out. Are we not confusing intuitive with standardized?

            • Jedd 6 years ago

              I can't say if anyone else is confusing intuitive with standardised.

              I can repeat (again) that I think it's great both options are there.

              Given that pretty much all the other 'standard' Postgres commands are \[a-z] (and sometimes \[a-z][a-z]), it sounds more like a complaint about \q.

      • wink 6 years ago

        I find it highly unintuitive that Ctrl-D only works on an empty line. Despite having known it for years (decades at this point?), I still regularly try to exit mid-typed line. And no, it doesn't matter whether this is a good thing or not -- I'm just saying that even if you've known it for years, it's still not intuitive that an "end of file" has to be on a new line.

      • cuboidGoat 6 years ago

        "You use the word intuitive a couple of times, and suggest there's a natural way to exit interactive cli programs (and it's not ctrl-d)."

        Surely it is when you detect the angry mashing of keys?

        There should really be a function for that in the standard library. Distinguishing such inputs from the actions of cats would be the difficult bit.

      • walterstucco 6 years ago

        And yet CTRL+D does not take you out from an Erlang console...

    • user5994461 6 years ago

      The intuitive way to exit anything is Alt + F4.

    • yawboakye 6 years ago

      > But losing out on beginners that aren't familiar in favor of things that are naturally intuitive is a great way to make those who are new to programming and especially databases feel like they're dumb or not capable of it.

      I disagree with you here, Craig. In my experience I reach for Ctrl + D, then Ctrl + C in quick succession to see what succeeds. I've learnt those and they work for most programs that don't trap exits (again, why would you trap exit?). It has saved me from having to learn the specific exit incantation for the interactive terminals of Python, Ruby, Node, and PostgreSQL (psql). I like that, and I'd prefer to have one way of exiting that works for all programs on the system than learn each program's preferred command.

      • SSLy 6 years ago

        >why would you trap exit?

        flush buffers, gracefully close sockets, print the final \LF to the stdout.
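        As a sketch of that kind of graceful shutdown (illustrative code, not any particular program's implementation; the handler and names are made up), a program can trap SIGINT and flush its buffers before deciding what to do:

```python
import os
import signal
import sys

# Illustrative sketch: trap Ctrl+C (SIGINT) so the program can flush
# buffers and release resources instead of dying mid-write.
cleanup_log = []

def handle_sigint(signum, frame):
    sys.stdout.flush()               # flush buffered output
    cleanup_log.append("flushed")    # stand-in for closing sockets, etc.

signal.signal(signal.SIGINT, handle_sigint)

# Simulate the user pressing Ctrl+C by signalling ourselves.
os.kill(os.getpid(), signal.SIGINT)

print(cleanup_log)  # ['flushed'] -- the handler ran instead of the default exit
```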

      • MaxBarraclough 6 years ago

        > I like that and I'd prefer to have one way of exiting programs that works on all programs on the system than learn what's any programs preferred command.

        I bet you love Vim ;-P

    • eip 6 years ago

      > Ctrl+D

      > things that are naturally intuitive

      Pick one

  • barrkel 6 years ago

    C-d is a great way to accidentally exit more than one shell when there's remote latency. I only use it when there's no other choice. My most frequent use case is ad-hoc file transfer via terminal paste, with the remote end running cat > out.txt.

  • tudelo 6 years ago

    Yet, principle of least surprise. What is more surprising, Ctrl+D working or quit/exit not working? Maybe I am an outlier but I would try exit/quit/bye first.

    • sdegutis 6 years ago

      I learned about Ctrl + D pretty early on in my software career, probably around 2009 or so, and have been using it 99.99% of the time since then[1]. It's both consistent and easy, so I really can't see any benefit of not using it.

      [1] The 0.01% of the time was when I was testing out using Windows for software dev. I don't remember but I don't think Ctrl + D works there.

      • dvlsg 6 years ago

        I find myself pressing ctrl+c on windows.

        • eitally 6 years ago

          ctrl+c also works on Crosh (Chrome Shell).

          • y4mi 6 years ago

            But not on Windows cmd

            I think it worked in PowerShell, though I can't test right now.

            Also: chromeos is Linux. It's a safe bet they'll support standard signals

            • dvlsg 6 years ago

              You mean in general or specifically for PostgreSQL? Ctrl+c definitely works in cmd in general, which is what my original post was referencing. I use it all the time.

              Not sure about PostgreSQL in cmd off the top of my head (posting from my phone). I suspect, because I got downvoted, that it may not work there.

    • TheDong 6 years ago

      Anyone using psql should be familiar with bash. Anyone using bash should be familiar with readline. Anyone familiar with readline should instantly recognize psql as being a readline input.

      It seems like a reasonable set of conclusions to draw, which makes ctrl-d indeed the most obvious thing to try.

      • mad_hominem 6 years ago

        I don’t think this follows. I’ve been at multiple prompts that look like readline but aren’t.

        Also ‘exit’ works in bash.

      • jorams 6 years ago

        Ctrl + D works just fine without readline as well, so you don't even need that conclusion.

  • lkbm 6 years ago

    I'd propose that "quit" should do two things:

    1. Print "You can also quit using Ctrl + D"

    2. Quit.

    But I also believe that only doing step 2 is better than neither. psql should prioritize psql's usability over educating users on Unix standards.

  • t0mbstone 6 years ago

    Ctrl+C kills a process with the signal SIGINT, which a program can intercept to clean itself up before exiting, or to not exit at all. Ctrl+D means end of file; it has no effect if the program isn't reading input from the terminal. In short: Ctrl+C means "interrupt", i.e., stop what you're doing, while Ctrl+D means "there is no more input".
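    A minimal sketch of how a REPL typically maps the two (hypothetical code, not psql's implementation): in Python, input() raises EOFError on Ctrl+D and KeyboardInterrupt on Ctrl+C, so a loop can exit cleanly on one and merely abandon the current line on the other:

```python
# Hypothetical mini-REPL: Ctrl+D (EOF) exits cleanly, Ctrl+C
# (interrupt) just abandons the current line and keeps going.
def tiny_repl(read_line=input):
    while True:
        try:
            line = read_line("> ")
        except EOFError:           # Ctrl+D: no more input, exit cleanly
            return "bye"
        except KeyboardInterrupt:  # Ctrl+C: drop this line, keep running
            continue
        if line.strip() in ("quit", "exit"):
            return "bye"

# Feeding it scripted input instead of a real terminal:
lines = iter(["hello", "exit"])
print(tiny_repl(lambda prompt: next(lines)))  # bye
```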

  • peteretep 6 years ago

    What does Python do these days? I seem to remember it would catch the common cases, but rather than quit, it would scold you and tell you how you should be doing it the other way, which seemed like the most Western European approach imaginable.

    • int_19h 6 years ago

         Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:59:51) [MSC v.1914 64 bit (AMD64)] on win32
         Type "help", "copyright", "credits" or "license" for more information.
      
         >>> quit
         Use quit() or Ctrl-Z plus Return to exit
      
         >>> exit
         Use exit() or Ctrl-Z plus Return to exit
      • int_19h 6 years ago

        Oh, and in true Python fashion, the reason it works is not because the REPL is special-casing "quit". It's because when you evaluate an expression, the REPL prints out its repr. And `quit` is not a function, but rather a callable object with a custom repr:

           >>> type(quit)
           <class '_sitebuiltins.Quitter'>
        
           >>> repr(quit)
           'Use quit() or Ctrl-Z plus Return to exit'
        
        and it's implemented like so:

           class Quitter(object):
               def __init__(self, name, eof):
                   self.name = name
                   self.eof = eof
               def __repr__(self):
                   return 'Use %s() or %s to exit' % (self.name, self.eof)
               def __call__(self, code=None):
                   # Shells like IDLE catch the SystemExit, but listen when their
                   # stdin wrapper is closed.
                   try:
                       sys.stdin.close()
                   except:
                       pass
                   raise SystemExit(code)
        
        The same is done for "help", "copyright", "license" and "credits". The REPL only does what its name says it does: read, evaluate, print, and loop.

        So there's no magic. There's only Python.

    • ahartmetz 6 years ago

      Python actually taught me to use Ctrl-D everywhere. Even though I was initially annoyed, I now understand the wisdom behind it :P

      • int_19h 6 years ago

        Except it's Ctrl+Z in Python for Windows, because that's the convention for EOF there.

  • pkulak 6 years ago

    You're wrong. I think the bike shed should be blue.

    ;)

    • viraptor 6 years ago

      This isn't really about alternatives. Bike shedding could be about exit-vs-quit or similar. Ctrl-d exists a layer above though. In a typical terminal / readline, your process will get an EOF / HUP signal which is treated like a clean-exit in most interactive cases and config-reload in many servers.

      Just like ctrl-c generates INT and ctrl-z generates STP. This is worth learning.

      • majewsky 6 years ago

        > Just like ctrl-c generates INT and ctrl-z generates STP.

        It's very different. Ctrl-D does not generate signals. When written into the tty master, it forces the kernel to answer any pending read() syscalls on the tty slave, even if it means returning 0 bytes.

        Programs then apply the convention of treating a read() that returns 0 bytes as meaning EOF (because that's how it works with all other file types that are not tty slaves).
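        The convention is easy to demonstrate with a pipe standing in for the tty (a sketch; the convention is the same): once the write end is closed, read() returns 0 bytes and the reader treats that as EOF.

```python
import os

# EOF is not a character or a signal: it's read() returning 0 bytes.
r, w = os.pipe()
os.write(w, b"some input\n")
os.close(w)                 # analogous to the tty delivering "EOF"

chunks = []
while True:
    data = os.read(r, 1024)
    if data == b"":         # 0 bytes read: the EOF convention
        break
    chunks.append(data)
os.close(r)

print(b"".join(chunks))  # b'some input\n'
```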

      • colanderman 6 years ago

        Don't forget Ctrl-\. For when you feel like Ctrl-C is too nice. (It sends SIGQUIT.)

    • koolba 6 years ago

      No, GP is correct. Control-D sends an EOF (end of file), which indicates there is no more input.

      This isn’t bike shedding at all as there’s a specific character code to indicate you’ve finished with a program and it’s standardized on all sane pieces of software. Heck it even works with things that aren’t specifically programmed to “exit” as they’ll simply end the stream on receipt.

  • threeseed 6 years ago

    What other apps use Ctrl+D ? I can't think of a single one.

    • LukeShu 6 years ago

      In a terminal in cooked mode, Ctrl+D sends EOF. Literally any program that reads from stdin and doesn't do something special to avoid it will handle Ctrl+D.

      That's the point about why it's worth it to learn: It's universal to all terminal programs, even if the author of the program didn't know about it.

    • 1stranger 6 years ago

      Practically any interactive shell. python or bash for instance. It's everywhere.

      • tofflos 6 years ago

        Except Powershell. :(

    • drb91 6 years ago

      Most apps that read from stdin handle EOF correctly.

    • beardicus 6 years ago

      python and node REPLs both do this.

    • TheDong 6 years ago

      Any command-line that uses readline[0], which in practice is an absolute ton of things.

      This includes nodejs's repl, irb, ed, gdb, tftp, virsh, gpg --card-edit, and others.

      It also works for programs that read stdin as if it's a file if you're using a modern readline shell to run them (e.g. cat, tee, etc)

      [0]: https://en.wikipedia.org/wiki/GNU_Readline

      • mortehu 6 years ago

        It has nothing to do with readline, but is a feature of the TTY (i.e. implemented inside the kernel) when in canonical mode. It will work for anything that exits in reaction to read(STDIN_FILENO, ...) returning zero.

  • paulddraper 6 years ago

    Oh no!

    I used to be able to Ctrl+C knowing that it would interrupt any currently running query but never lose my session.

    If I want to quit the PostgreSQL shell, I'd Ctrl+D like any other shell.

    What will I do now?

    • tuukkah 6 years ago

      Where does it say that Ctrl-C handling has changed?

    • grzm 6 years ago

    Ctrl-D still works. The new commands are additions.

      • paulddraper 6 years ago

        Right, but now I can't Ctrl+C to cancel a query without worrying that I will exit the shell instead.

        • Shish2k 6 years ago

          You're worried that when you try to hit ctrl-c you might accidentally type "exit"? I'm not sure how else the addition of exit as an alias for ctrl-d would affect your ctrl-c work...

kstrauser 6 years ago

A million things like this are why PostgreSQL is the only relational database I consider for, well, pretty much anything. Their approach of “build it safe and then make it fast” has been paying off in spades for a couple of decades now. Thanks for everything you do, psql maintainers!

  • mrep 6 years ago

    100% naive question: why is mysql and their similars so popular then?

    Spanner and AWS Aurora are based more on MySQL than PostgreSQL, from what I can tell. Why?

    • dotancohen 6 years ago

      > Why?

      Notice that parallel to MySQL's rise a particular loosely-typed, never-except, often-wrong programming language also became popular. To this day I feed my family with that language.

      The typical coder (note that I do not say "developer") who codes in PHP does not care about correctness. He does not understand why monetary values cannot be stored in floats, he does not know what bitwise manipulation is, he does not know the difference between character encodings. What does he care what MySQL does? He knows that he can put strings in if he calls them VARCHAR, and he can pull the right one out with a WHERE. He does not care enough to check that user bios fit in his VARCHAR, and when he learns to JOIN he does not understand which constraints to put in the ON clause.

      I do have nicer words for the L and A in the stack.

      • rgbrenner 6 years ago

        In fairness, some of this is historical baggage that MySQL got stuck with from its early popularity.

        mSQL was a free/low-cost SQL database in the early 90s. Originally it was an SQL translator built on top of Postgres (which used POSTQUEL). That was too slow (because Postgres had higher system requirements), so a new lightweight engine was developed for it, and that's what later became MySQL. The point was a lightweight SQL database that ran well on cheap early-90s computers.

        The compromises to make mSQL fast were inherited by MySQL, and they can't be easily changed without breaking the ecosystem that was developed around mSQL/MySQL.

        Postgres, on the other hand, was a university project in the 80s and 90s, developed on mainframes. It was always a slow-moving, cautious project. Yes, it didn't make the compromises mSQL made, because it didn't have to. It used more memory and more CPU, but was more correct/safe than MySQL.

        That was the same reason it lost in the 90s. mSQL was out being used by websites and other projects on cheap computers. Postgres was mostly waiting for computer tech to get fast enough that they wouldn't need to compromise their code too much to take it off the mainframe and onto cheaper computers. Postgres only adopted SQL in response to mSQL's popularity. Early Postgres was also harder to set up, another point that wasn't addressed until after mSQL took off.

        MySQL was popular because of the compromises they made. If mSQL had just kept Postgres as the engine, MySQL never would have existed. And if they had instead waited (even a couple of years) and done things correctly, they wouldn't have beaten Postgres -- which had a decade's lead in development and was more advanced than mSQL/MySQL.

        • evanelias 6 years ago

          > The compromises to make mSQL fast were inherited by MySQL, and they can't be easily changed without breaking the ecosystem that was developed around mSQL/MySQL.

          This doesn't sound accurate to me. Which compromises are you referring to?

          Historically, MySQL's major source of criticism related to leniency of type safety / automatic data conversion -- which is unrelated to performance. It's also essentially a solved problem with the advent of strict sql_mode. MySQL made this the default in 2015, but it has been available (and recommended as a best practice) since 2004.

          Performance in MySQL, as a general topic, greatly depends on the storage engine. Relative to Postgres, MySQL's pluggable storage engine API is both a blessing and a curse -- it permits use of alternative engines that perform significantly better for specific workloads, at the cost of substantial administrative complexity.

          Performance-wise, Postgres generally has the lead for things like number of supported index types, join strategies, query planning for very large queries (and OLAP workloads in general). These may or may not matter for you, depending on your workload.

          • marcosdumay 6 years ago

            The main selling point of MySQL in the early 2000s was that it used in-memory storage with only opportunistic disk writes. That made for an incredibly fast database, with obvious downsides that many people refused to notice.

            • evanelias 6 years ago

              What storage engine are you referring to? I've been using MySQL since the early 2000s, and your description doesn't match MyISAM (which uses the filesystem cache for data blocks, only using its own caching for indexes) nor InnoDB.

              MyISAM does offer amazing write performance, at the cost of not being crash-safe. But given its reliance on the fs cache, it would be unusual to describe it as "in-memory storage". Anyway, the GP was talking about compromises that "can't be easily changed without breaking the ecosystem", which doesn't describe MyISAM either -- MyISAM is basically deprecated in modern MySQL deployments.

        • SSLy 6 years ago

          >Postgres only adopted SQL in response to mSQL's popularity.

          Then what did it use before?

          • rgbrenner 6 years ago

            It had two methods of retrieving data. The first was what was known as a navigational database: https://en.wikipedia.org/wiki/Navigational_database

            This was something that was popular back when you used tape to store data. A record contained navigational references that told you where the next record was, allowing you to fast forward the tape to that position without reading everything in between.

            The 2nd was POSTQUEL, which was a QUEL language: https://en.wikipedia.org/wiki/QUEL_query_languages POSTQUEL was the preferred and recommended way of querying postgres.

            These were both on their way out even in the early years of Postgres (mid-80s). Navigational databases are from the 60s. SQL was invented at IBM in the early 70s, and adopted by Oracle and DB2 in late-70s, and by the mid-80s they had gained significant market share and most databases that used QUEL had moved to SQL around that time.

            • davidgould 6 years ago

              I've been involved with Postgres from before it had SQL; at Illustra we added SQL in a parallel effort to the PostgreSQL open-source project, and I'm not sure I'd agree with your points:

              Specifically, POSTQUEL was not really obsolete, it's just that the market had picked SQL by then. Compare the fates of git versus Mercurial. Mercurial is not obsolete, but git is the clear popularity winner.

              Second, I don't recall at all a navigational interface to postgres and the idea that this was a primary access method in postgres is quite surprising. Do you have a reference for this? I'd be very curious to read it.

              • rgbrenner 6 years ago

                Sure, of course QUEL still worked... but Postgres was one of the last databases to use QUEL. It was pretty clear SQL had won even when Postgres was just getting started (mid-80s). The Ingres project (which developed QUEL) ended by the time Postgres started (obviously, since the name is Postgres = Post-Ingres). Ingres influenced Sybase, which switched to SQL in the late 80s. Ingres Corp even switched to SQL in the late 80s. What other databases using QUEL existed in the mid-90s?

                POSTQUEL was definitely the primary interface to postgres. The navigational interface was an early Postgres feature. See page 3 and page 10 ("fast path") of Stonebraker's 1990 paper on the implementation of postgres: http://db.cs.berkeley.edu/papers/ERL-M90-34.pdf

                • davidgould 6 years ago

                  I agree that commercially SQL was winning by the time postgres got started, but I think it was not yet obvious there was room for only one query language. I also suspect that the early implementation shared a lot of Ingres code as the focus was on the "post-relational" topics. Building SQL would have been a distraction. Perhaps also Stonebraker did not want to Osborne Effect Ingres. I'm sure even early Oracle marketing would have had fun with "even the founder of Ingres agrees QUEL is obsolete". Finally there was a lot to like in QUEL and SQL has quite a few warts.

                  Thanks for the link. Fast path was mostly about the function manager and only incidentally about data access. As in, "if you call all these functions that are part of the infrastructure, you can get at data". But that's not really a navigational interface (get first of set), more a side effect of exposing the internals via the function manager.

                  I started at Sybase in 1988; there were a lot of Britton Lee alums there, who had also worked on the Ingres project. I ran into the rest at Illustra in the 90s.

      • sitepodmatt 6 years ago

        This.

        One also has to remember that long ago there was no serverless, no VPS/cloud/EC2/Linode instances, no Heroku/Beanstalk/App Engine, no Docker/Vagrant. To get going fast you had $x/mo shared hosting and [W|X]AMP for development, and this horrible thing called FTP. The only reasonable way to host a website cheaply was on some terrible shared hosting server which only (reasonably) supported PHP and MySQL (okay, I lie, there were cgi-bins and mod_perl); otherwise you'd be looking at dedicated servers.

      • edoceo 6 years ago

        That's a broad generalization of PHP "coders"

        I think it's safe to say we've all seen good code and crap code -- regardless of language.

        It's not the wand, it's the magician.

        (nb: I've been building software for 20+ years, I use PHP (among others) and knew JOIN and types in PG before I ever saw PHP, I cannot be the only one)

        • phaer 6 years ago

          Yes, but the proportion of "bad" coders (let's rather call them inexperienced) to experienced coders varies strongly across ecosystems.

          PHP and JavaScript both democratized programming, as BASIC did years before, but that also meant that more tutorials and howtos for PHP were written by people relatively new to programming compared to say howtos on Haskell or C.

          • cookiecaper 6 years ago

            It's an "Eternal September" effect that never really wears off. I'm growing increasingly concerned that this is descending upon Python now. The internet as a whole is a wide enough system that it was able to withstand its Eternal September (though it's arguable that newsgroups weren't), but any specific language ecosystem, I fear for the moment it hits "mainstream".

        • dotancohen 6 years ago

          You're not the only one, but you are one in a thousand.

          When I work with Python, Java, C++, bash, SQL, or Javascript in a team, I feel mediocre at best. I can get around, but I know that I'm easily outclassed. Contrast that with PHP, where I'm almost always the best dev in the room. True, I have much more experience with PHP than with the other technologies I've mentioned, but PHP has the dual curse of a very low bar to entry, and of seeming to work well enough that most "PHP coders" never feel the need to progress beyond the most basic understanding.

          When I meet new devs I deliberately try to postpone mention of PHP as long as I can, to avoid attaching myself to the tainted stigma that PHP has acquired.

      • chrisgd 6 years ago

        Is storing dollars as floats just a MySQL issue? I'm not sure why you wouldn't want to do this.

        • Tostino 6 years ago

          No, that is not just MySQL. Do a little googling and you'll run across plenty of information on why it's bad practice in any language or database. You need to avoid converting dollar values to floating point at every place in your stack; otherwise you will introduce floating-point rounding errors.

          It's not just the data type it's stored in: if you convert it to floating point at any point, you lose exact precision and introduce those errors. They can add up to some serious differences over a lot of values.

          I did some analysis on my database to see: if I had stored any of those numbers as floating point, what would the difference be in total, and what would be the absolute difference for any single account? It was a while ago, but I believe the absolute difference per client was around $20 or $30, while the absolute difference for a single account within that client was as high as $100. Obviously that's pretty damn unacceptable.
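          The drift is easy to reproduce (a generic illustration, not tied to any particular database):

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly,
# so money stored or summed as float drifts; Decimal stays exact.
print(0.1 + 0.2)                          # 0.30000000000000004
print(0.1 + 0.2 == 0.3)                   # False

print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))  # True

# A hundred ten-cent charges should total exactly $10.00:
print(sum(Decimal("0.10") for _ in range(100)))  # 10.00
```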

    • rgbrenner 6 years ago

      History. A long time ago, in the early 90s, there was a database called msql. It was a little DB that used SQL and ran well on low end systems. It was also free. In fact, it was the ONLY free SQL database out at the time -- that's why it was written.

      Because of those two advantages (that really only lasted a couple of years), a small ecosystem developed around it. This was the database you used to develop interactive websites in the early/mid-90s.

      Then MySQL came. MySQL extended mSQL and kept compatibility with its ecosystem... and people switched from mSQL to MySQL.

      Postgres also existed in the 80s and 90s, but it used POSTQUEL, was harder to setup, and had higher system requirements. Not a lot, but enough to discourage use on low end computers. (Remember, Postgres at the time was being developed on university servers, where they had plenty of system resources.)

      They did address these points in Postgres95 (in fact, Postgres adopted SQL as a response to mSQL's popularity), but by then it was too late. mSQL and MySQL already had an ecosystem... and in that brief moment, they lost the race.

      The idea that it was slower stuck around for many years, even after the improvements in computers, and (less so) optimizations in Postgres made it no longer true.

      Postgres didn't start being considered again until Go, RoR, et al started to become popular.. displacing the PHP ecosystem. And the biggest reason you see MySQL around so much today is that there's about 25 years of legacy code built around it.

      Edit: If you're curious what POSTQUEL was like: https://en.wikipedia.org/wiki/QUEL_query_languages

    • manigandham 6 years ago

      MySQL was easier to run.

      Both for beginners, where it was packaged with PHP and available with basically every webhost, and for advanced users, who took advantage of the simple and reliable replication. There were a lot of problems with it, but the novice users didn't run into them and the advanced users knew how to deal with them. These decades of mainstream use carry a lot of momentum, and many large companies continue to use it because they've always used it (like YouTube, for example).

      PostgreSQL only recently became much more usable after v9.5 and is now being recognized and growing in popularity.

      • philliphaydon 6 years ago

        PostgreSQL was usable long before 9.x; I just think the features of the 9.x series are what have made it much more popular of late.

        I think I first started using it at v6 or 7. I liked that I could run it from a .bat file and spin it up for tests without installing it.

        • manigandham 6 years ago

          > "much more usable"

          It always worked well, but there are many newer developments around operations and scaling that finally made it much easier to run. Even now there are some limitations with logical replication and horizontal scaling but the main system can easily be installed and running in a few commands to serve most users just fine.

          It should be noted that the rise of Docker containers and K8S has also helped greatly in letting users run all kinds of software with minimal work.

        • spion 6 years ago

          Replication is still a pain.

          • scurvy 6 years ago

            Indeed. MySQL replication topologies with GTID are super easy to maintain and run. Does Postgres still require "extra" tools like pg_rewind to do a simple operation like turning a former primary server into a new replica?

            Some of the MySQL backup/restore stuff is a lot more straightforward (innobackup vs WAL shipping).

            • cookiecaper 6 years ago

              GTIDs are relatively new and require some porting work to implement on pre-existing databases. MySQL replication is a massive disaster without them. Like most things in PostgreSQL, there is a bit of extra legwork to get things set up right up front, but when they work, they work.

              MySQL replication with binlog_pos was a total crapshoot. The reason innobackupex exists (and anything from Percona, to be honest) is because MySQL's default options were so unworkable. The Postgres equivalent is probably something like barman.

              • scurvy 6 years ago

                Hyperbole much?

                MySQL replication was workable before GTID; GTID simply made things a lot easier. GTID was hard to work into existing DBs at places like Facebook, but we managed to get things going without a DBA on several multi-TB databases (sans downtime).

                GTID is over 4 years old at this point. That's not relatively new in the tech world. Especially with the pace of things like Kubernetes and containers.

                There are 3rd party additions to Postgres, too. pg_rewind was written by eBay (?) to address the obvious shortcomings of repointing primaries and replicas.

      • Avamander 6 years ago

        > and is now being recognized and growing in popularity.

        To be fair, PostgreSQL was noticed way before; Estonia's e-nation website was built on it (even the logic!) and I honestly can't remember that thing ever being down. The UI could use a bit of refreshing, but I have a feeling that if they update it they'll replace it all with something shittier.

    • twunde 6 years ago

      Historically, Postgres was slower than MySQL, harder to manage, wasn't available on shared hosting, and generally speaking wasn't beginner friendly. Because of this, a generation of programmers learned MySQL, especially as everything PHP used MySQL. The install base for MySQL was therefore quite large, so there were more potential customers for MySQL projects (WordPress, Drupal, and Joomla all run on MySQL, so we're talking about 50% of websites before even getting to custom applications).

    • samcheng 6 years ago

      I used both of them, but migrated to Postgres permanently a decade ago.

      Early on in my career, the biggest selling point for Mysql was the excellent web admin tool "phpMyAdmin" - it really helped get applications off the ground, since the core of most modern systems is the data model. Users could modify data, without your needing to create a UI for that use case.

      It conveniently used the same stack as the rest of the software, but I remember spinning up a phpMyAdmin instance even after moving away from PHP as an application language.

      Postgres still doesn't really have anything like that! The closest that comes to mind is Django's Admin tool.

      • gbuk2013 6 years ago

        It was the same for me: back when I first learned SQL I really tried to use Postgres, but phpMyAdmin was so much better than phpPgAdmin that I gave up. A lesson in the importance of user-friendly UIs. Perhaps some developers can really nail a schema on the first go, but for me it usually takes a lot of tweaking before I really understand what I want, and the CLI is not the best place for that, especially if my mind is deep in the code I am trying to write.

      • Avamander 6 years ago

        JetBrains' IDEs have Postgres data source support, I find that GUI really rather excellent for anything I've had to do and couldn't remember the syntax for.

      • rgbrenner 6 years ago

        phpPgAdmin, started ~2002: http://phppgadmin.sourceforge.net/doku.php

        Basically a port of phpMyAdmin to postgres. I used it for several years around ~2003-5.. it's nearly identical.

        • luhn 6 years ago

          I’m a regular Postgres user and fan, but cut my teeth on LAMP. I can’t speak to the current state of the projects, but phpMyAdmin was vastly superior to phpPgAdmin in the 2000s. phpMyAdmin remains the best DB GUI I’ve ever used. (Although I haven’t used any GUI in quite a few years.)

      • StreamBright 6 years ago

        There are tons of tools like phpMyAdmin for Postgres.

    • tejtm 6 years ago

      Will make an attempt to answer, hopefully without starting anything. MySQL aimed at lowering the bar for non-DBA app developers; Postgres did not. In some cases it matters, in many it does not, especially if you have the DBA skills on hand to mind the gaps.

    • evanelias 6 years ago

      The question of MySQL's popularity comes up often in HN Postgres threads. The answers naturally skew towards being a bit one-sided, as most people clicking on a Postgres article tend to be Postgres users currently (and sometimes former MySQL users in years past). This can result in comparisons of modern-day Postgres to 5+ year old versions of MySQL.

      That said, and with a disclaimer that my own bias leans the other way (as I have a lot more MySQL experience than Postgres experience) although I try to be impartial about these things:

      * Spanner: I've heard this second-hand and might be incorrect, but as I understand it, Google's original internal MySQL team was shuttered in ~2009 and some of those folks may have transferred over to the Spanner team. I wouldn't say that Spanner is particularly based off of MySQL anyway though, but I don't have enough familiarity to say that conclusively.

      * Aurora: AWS now offers a Postgres-based Aurora as well. As for why they built the MySQL one first, I'd assume that was likely a business decision based on MySQL-vs-Postgres RDS usage at the time.

      * Those two aside, there are definitely major products based around Postgres. AWS Redshift is one example. Or look at CockroachDB, which chose wire-protocol compatibility with Postgres. (There are some examples the other way too; e.g. TiDB, which chose wire-protocol compatibility with MySQL).

      * Regarding why MySQL has historically been popular, there are a lot of factors. I'd say the biggest one for large users has been replication; some of my older HN comments delve into that more, https://news.ycombinator.com/item?id=16880663 for one example. For smaller users, ease-of-use has been important. The other discussions in this thread hint at this, especially around silly simple things like the "exit" command.

    • ma2rten 6 years ago

      MySQL used to be more popular. It was the M in the LAMP stack, which virtually everyone used to use.

      It also used to be more scalable, because PG's replication story used to be poor. This is why Google and (I assume) Amazon used MySQL internally. This is all a long time ago, but I suspect this is the reason more people are familiar with MySQL than Postgres at these companies. Keep in mind that Spanner was developed and used internally at Google long before it was publicly announced.

      • mrep 6 years ago

        > This is why Google and (I assume) Amazon used MySQL internally.

        Amazon actually uses a lot of Oracle with some scattering of MySQL.

    • gwn7 6 years ago

      Probably accessibility.

      Because MySQL, for some reason, is more accessible[1] to beginners, juniors, average developers, and other people who may prefer a lower barrier to entry over correctness/safety (e.g. full-stack devs, managers, people who don't have much time); combined, they form the bulk of the community.

      It's the same for languages, frameworks, and other tech; as several people noted already.

      [1] Or maybe "was", in which case the answer becomes "historically superior accessibility".

    • mistrial9 6 years ago

      from second-source information: in the late 90s, just about every database product in the world, from tiny to massive, was tested as a backend to various web sites. Scaling was still being invented. MySQL was simply the fastest (only by microseconds) to return a set of strings to a php page, and php was a fast development environment. Neither php nor MySQL had the rigor that many in the database world demanded, and that was ok. In fact that helped to make the MySQL/php combination fast.

      Once the herds picked MySQL (and php) the economies of scale started, and advantages grew. Meanwhile, the proper and safe competitors were generally disgusted (drama word but not entirely an exaggeration) but who cares about them, maybe even a weird positive to the new breed of web devs who were inventing the future.

    • jpalomaki 6 years ago

      For Spanner and Aurora: MySQL has a lot fewer features. On the query side, for example, it has been really limited compared to Postgres. Maybe that was a plus in this case: fewer features make it easier to optimize.

    • jitans 6 years ago

      There are those who care about their data, and those who use MySQL.

    • Keyframe 6 years ago

      MyISAM made it popular back then, and the rest of it is inertia.

    • johncolanduoni 6 years ago

      Probably because compatibility with MySQL is more valuable than compatibility with Postgres from a product perspective.

    • apapli 6 years ago

      I’m no expert on this topic and have used both PG and MySQL.

      My preference is MySQL purely because the syntax is easier, or at least consistent with what I learned at uni in the late ‘90s.

      I also like MySQL workbench, although I’m sure there is an equivalent for PG I haven’t needed to look for one in recent years.

      If PG allowed me to use the same syntax as MySQL I’d probably switch, purely because experts and those with more experience than I generally prefer PG for a whole lot of reasons.

      EDIT: I should have googled before replying. Looks like the syntax is near enough identical for my simple needs, and pgAdmin will do what I need. I'm not sure why I had to learn some other weird syntax in 2012 for Postgres; perhaps it was just shortcuts. Either way, I'm going to seriously consider migrating now. :)

      • MisterOctober 6 years ago

        I recently mostly switched to Postgres from MySql, and using Postgres really points up some of the odd behavior / 'features' of MySql -- e.g., its parsing but not enforcing check constraints, "USE database_name," disinclination to handle JSON data, etc.

        Also, Postgres' documentation is really fantastic. I bought a book on Postgres when I switched, but really I could have just stuck to the online manual and saved the thirty bucks.

        I like many things about MySql [as you imply, syntax a bit more verbose but easier to remember - e.g. "SHOW TABLES" vs "\d" ; I also like the nonstandard behavior of MySql 'GROUP BY' as it makes good sense to me ]-- but overall I'm really enjoying the flexibility of Postgres. PGadmin4, on the other hand, leaves much to be desired -- I generally stick to psql.

      • dtech 6 years ago

        MySQL Workbench is still 10,000 miles ahead of PgAdmin. About three years ago the latest iteration, PgAdmin 4, came out; I tried it on and off for a few months, but it was so buggy as to be unusable for me. It might have gotten better since then.

        It's a shame: psql is awesome once you know enough to use it, but GUIs are how people start using RDBMSes, and this is about the only thing in which MySQL is clearly superior to PostgreSQL.

        • GordonS 6 years ago

          PgAdmin recently became an in-browser tool, and is much improved - in particular, it no longer uses GBs of memory and 90% of my CPU!

          • chinhodado 6 years ago

            pgAdmin 4 is still horribly slow, both at startup and during operation, and horribly bloated in size (100 MB compared to pgAdmin 3's 10 MB). It still hogs CPU and RAM like hell.

            • Something1234 6 years ago

              What ever happened to pgadmin3? Why did development stop on it? It was definitely a beast, with a lot of weirdness to it, and some minor pain points, but it was good. It was lightweight resource wise.

      • akavel 6 years ago

        What syntax do you have in mind? Can you give some example of what you see as easier in MySQL than in Postgres? Genuinely curious; I don't have a horse in this race.

        • apapli 6 years ago

          I recall having to type \l from the command prompt to show databases (or something like that). I prefer "show databases".

          • mrep 6 years ago

            Haha, same. When I first used a postgres database, I got super annoyed trying to figure out how to simply show the databases and tables as I kept trying to use show and describe.

          • petepete 6 years ago

            The standard way to do this would be to query `information_schema.tables`.

            PostgreSQL's approach isn't the most intuitive, but it's well documented and very powerful. A few aliases could probably help; it's good they've added `quit`/`exit` this time around.
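
            For example, a rough equivalent of MySQL's SHOW TABLES, assuming your tables live in the default `public` schema:

                SELECT table_name
                FROM information_schema.tables
                WHERE table_schema = 'public'
                ORDER BY table_name;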

        • mgkimsal 6 years ago

          I can say that setting up user accounts and permissions to get up and running is, to me, still easier under MySQL, because you connect to MySQL as root and do the rest from inside.

          PostgreSQL still seems to have two ways, but the majority of tutorials demonstrate the "become the postgres user and add unix-level accounts" approach. Things like having a unix-level "createuser" program that manages pg stuff are... odd to me, because it's not self-contained in the app. It's more system-level admin stuff I need to do or be aware of.

          Whenever I bring this up, I get push back that I'm doing it wrong, and that of course you can just create user accounts from within pg itself via psql, and ... all other sort of attendant feedback. This approach had not seemed to be the norm/default approach, nor the one that was in multiple postgres books I had 10 years ago.

          I use pg on some projects, but am no expert in it.

  • alecco 6 years ago

    > “build it safe and then make it fast”

    I have a lot of respect for the PostgreSQL team. But I think they don't care about the second part, or the constraints of the first part make it impossible. A simple example, the performance of a simple un-indexed DISTINCT on a single string column on a single table, is still quite bad for millions of rows.

    Also of note: PostgreSQL has too many features for my taste. I get that it's the selling point, but that usually correlates with poor performance and code complexity, too.

    Of course, there's the problem that all modern fast database engines are proprietary. But by not addressing the performance problem, perhaps with a smaller engine lighter on features, open source is left biting the dust. And that's not good for the industry.

    • jacques_chester 6 years ago

      > A simple example, the performance of a simple un-indexed DISTINCT on a single string column on a single table, is still quite bad for millions of rows.

      Why is this surprising?

      • IshKebab 6 years ago

        I assume he means bad in comparison to other DBs.

        • jacques_chester 6 years ago

          So do I. But it's a bit like criticising a car for being slow because you didn't push the accelerator pedal. That's not really the car's fault.

mmartinson 6 years ago

Thank you pg maintainers, you've made such a wonderful thing.

  • sergiotapia 6 years ago

    It really is incredible that we can get something as fundamental as a database for free that is so well made and production-ready. Can you imagine if there were no Postgres? We'd have to use MySQL or Oracle or (gasp) MSSQL.

    • SOLAR_FIELDS 6 years ago

      If MSSQL wasn't tied to the Microsoft ecosystem, I don't think it would get such a bad rap - When I was a dev on the Microsoft product stack I found MSSQL quite a joy to use - reasonably fast, pretty powerful admin tools (SSMS) and pretty easy to interface with as long as you stayed within the MS ecosystem. I did find myself missing SSMS specifically when I moved to non MS stack development, since Postgres' 'official' open source options aren't nearly as robust. There are a few proprietary tools that can rival SSMS for Postgres, but then you are back into the realm of proprietariness.

ahartmetz 6 years ago

I have a serious problem with Postgres. Every year at FOSDEM, some of the most technically interesting talks are about Postgres, and the room is always far too small so I can't get in.

  • adwhit 6 years ago

    The AW block is tiny but has so many great talks. All the queues end up merging with each other and jutting out into the foyer. Chaos!

  • twic 6 years ago

    Sounds like you need to VACUUM ANALYZE a few of the other attendees to make room ...

kbumsik 6 years ago

One thing the OP didn't mention is that the JIT is not enabled by default; so far they've found performance improvements only for long and complex queries. [1] But it can be enabled with a simple configuration setting: jit = on.

[1]: https://www.phoronix.com/scan.php?page=news_item&px=PostgreS...
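
To try it out, it can be flipped per session and observed in a plan; `big_table` here is just a stand-in for one of your own tables:

    SET jit = on;            -- or jit = on in postgresql.conf
    SET jit_above_cost = 0;  -- force JIT even for cheap queries (testing only)

    -- when the JIT fires, the EXPLAIN ANALYZE output gains a "JIT" section
    EXPLAIN (ANALYZE) SELECT sum(x) FROM big_table;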

  • anarazel 6 years ago

    It's expected to only have benefits for longer queries. The problem is primarily that the current logic for predicting what is likely to be a long query isn't perfect, so it can trigger at the wrong moments. Since integrating the feature we'd had an item on the "pre-release item list" [1] to decide whether to enable or disable it, but we definitely wanted it enabled during most of the beta test period. This wasn't really a last-minute "whaaaaaaa" thing...

    [1] https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items

shrumm 6 years ago

I’m already loving PG10’s native partitioning support. PG11 closes the loop on a few convenience features like being able to set foreign and primary keys at the master table level. With PG10 I can’t use ON CONFLICT clauses - looking forward to upgrading and sorting that out.

  • LunaSea 6 years ago

    I encountered the same limitations as you while implementing partitioning a few months ago, and it's possible to use ON CONFLICT if the unique index is set on the partition itself rather than the partitioned parent table.

    So after initialising a new partition, I also create a unique index attached to that partition directly instead of the parent table.
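
    A sketch of that setup, with a hypothetical `events` table partitioned by month (in PG10 the parent can't carry the unique index, so the insert targets the partition directly):

        CREATE TABLE events (
            id bigint NOT NULL,
            ts timestamptz NOT NULL
        ) PARTITION BY RANGE (ts);

        -- each new partition gets its own unique index
        CREATE TABLE events_2018_10 PARTITION OF events
            FOR VALUES FROM ('2018-10-01') TO ('2018-11-01');
        CREATE UNIQUE INDEX ON events_2018_10 (id);

        -- ON CONFLICT can now use the partition's index
        INSERT INTO events_2018_10 (id, ts)
        VALUES (42, '2018-10-15')
        ON CONFLICT (id) DO NOTHING;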

    • shrumm 6 years ago

      wow thanks, I never thought to try it; I just dismissed it and never dug further. I'll try this!

  • welder 6 years ago

    Check out TimescaleDB, a Postgres extension that auto-creates partition tables for you, whereas with PG11 native partitioning you still need to manually run CREATE TABLE.

    https://news.ycombinator.com/item?id=18026699

    • shrumm 6 years ago

      would love to but this is a no-go at this point, since I’m on RDS. I think it’s great you’re solving this though - I currently use a cron job to create new partitions.

Nelkins 6 years ago

I'm excited for hash partitioning. Being able to partition on a uuid is going to be very useful.

dkubb 6 years ago

I love Postgres, but the one thing I find myself wishing for is better Unicode support in the regex engine. I like to add strong database constraints and sometimes lean on the regexp engine, but I find it's missing a lot of features available in other regexp engines. PCRE compatibility would be ideal, but even just support for the Unicode \p classes would be a big step up.
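
The kind of constraint I mean, sketched against a hypothetical `users` table; POSIX classes like [[:alnum:]] work today, but \p{L} and friends do not:

    ALTER TABLE users
        ADD CONSTRAINT username_format
        CHECK (username ~ '^[[:alnum:]_]{3,30}$');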

ggm 6 years ago

I love PG. Heaps better than the alternatives, for many reasons. My main one is native IPv6 object support.

I found the migration from 9.x to 10.x deeply painful, I think the one-time cost of the dump-restore cycle hurt me far more than I expected. I very much hope the transition to 11 can be done more expeditiously.

  • chillydawg 6 years ago

    For future reference, you can use logical replication to do this online between versions with a minimal window at the end for a final sync when apps are off and not changing data. By that I mean have your 9.6 in prod running as normal, set up your 10/11 newprod and then kick off logical replication for ALL TABLES from prod->newprod. That'll chug along and as long as you have appropriate resources and replication configured correctly, it'll replicate your data over and maintain it while prod is still being used. Then you turn your apps off, let it finish the sync, switch the apps over to the new db and you're good. Obviously there's some missing detail like users, extensions, replication etc, but that's always very specific to an install.
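
    One caveat: native logical replication only exists as of PG10 (a 9.x publisher needs the pglogical extension), but for a 10 -> 11 move the core of the flow above looks roughly like this, with placeholder names:

        -- on the old primary (requires wal_level = logical)
        CREATE PUBLICATION migration FOR ALL TABLES;

        -- on the new server, after copying the schema across
        -- (e.g. pg_dump --schema-only piped into psql)
        CREATE SUBSCRIPTION migration
            CONNECTION 'host=oldprod dbname=app user=replicator'
            PUBLICATION migration;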

  • RedCrowbar 6 years ago

    Have you tried pg_upgrade?

    • ggm 6 years ago

      Yes. I still don't understand why it failed. It gave me a huge list of locale-related reasons it was unhappy, and since I hadn't selected a locale config option when installing either the prior or the current version (9.4 -> 10.x), I felt pretty weirded out.

      • tourdownunder 6 years ago

        Postgres 9.4.0 was released 2014-12-18

        The version convention changed with v10.0, so it should be trivial to go from 10.x to 11.x. In your position I'd update 9.4.x -> 9.5.x -> 9.6.x -> 10.x.

deytempo 6 years ago

What are some reasons I might consider Postgres over MySQL when choosing a DB platform?

  • petre 6 years ago

    JSON support, the ability to query it, JSONB support, real spatial extensions, timescale extensions, consistent behaviour, online backups. We now have to migrate from MySQL to Postgres because of these features. Instead of MySQL 8 we will probably be using Pg 11. Right now we are on MariaDB.

    • evanelias 6 years ago

      > JSON support, ability to query it

      MySQL 5.7 has this, and MySQL 8 improves on it further. Note that MariaDB's implementation is substantially different from MySQL's here.

      > consistent behaviour

      This one's a bit vague :) But generally using strict sql_mode and innodb_strict_mode go a long way towards alleviating this common complaint.

      > online backups

      Online physical backups can be created with Percona's xtrabackup, which is open source. Or for online logical backups, there's always mysqldump, or third-party concurrent options like mydumper.

      > real spatial extensions, timescale extensions

      Yeah these ones I definitely see Postgres having a substantial edge, if they're important to your application or workload.

  • jpalomaki 6 years ago

    Postgres has much better SQL support and has features that will make you smile if you end up writing complex queries for purposes like reporting.

    • pmontra 6 years ago

      Yesterday a developer friend told me he was trying to write a query in MySQL to get the latest message of every conversation, stored in a table like (conversation id, timestamp, user id, message).

      I did that years ago, so I told him he'd find a lot of solutions on Stack Overflow: ugly ones for MySQL and much better ones for PostgreSQL, which has a row_number() window function.

      A MySQL DBA solved that problem on a project I worked on years ago without any query at all: he used triggers to store the id of the most recent message of every conversation in another table. That is probably the most efficient solution in any database. My friend told me he wants to write that query, no triggers, not even an INSERT OR UPDATE from his PHP code. I replied to go on; he'll learn a lot of things, and I hope this will make him pick PostgreSQL next time.
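
      A sketch of that trigger approach, with hypothetical table and column names:

          -- side table holding the latest message per conversation
          CREATE TABLE latest_messages (
              conversation_id integer PRIMARY KEY,
              message_id      integer NOT NULL
          );

          CREATE FUNCTION track_latest_message() RETURNS trigger AS $$
          BEGIN
              INSERT INTO latest_messages (conversation_id, message_id)
              VALUES (NEW.conversation_id, NEW.message_id)
              ON CONFLICT (conversation_id)
                  DO UPDATE SET message_id = EXCLUDED.message_id;
              RETURN NEW;
          END;
          $$ LANGUAGE plpgsql;

          CREATE TRIGGER messages_latest
              AFTER INSERT ON messages
              FOR EACH ROW EXECUTE PROCEDURE track_latest_message();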

      Developers tend not to like triggers or stored procedures IMHO because it's code outside their reach in a language most of them don't know, but that's another story.

      • masklinn 6 years ago

        > Developers tend not to like triggers or stored procedures IMHO because it's code outside their reach in a language most of them don't know, but that's another story.

        You can write PG stored procedures in close to a dozen languages between the builtins (pgSQL, Tcl, Perl and Python) and third-party (Java, Ruby, PHP, R, Scheme, JS, …).

        The language itself is not why developers don't like stored procedures, it's because stored procedure are an even worse version of image-based languages (like Smalltalk or Self, except they had much better UI, tooling and interaction model): hard to version, hard to update, hard to test, hard to debug, hard to cache, hard to scale, …, and your wide-spread well-supported ancillary tooling goes away pretty much entirely.

      • tormeh 6 years ago

        Stored procedures are plaintext code running on production servers and are, in general, never cleanly redeployed, so by default they discourage source control. That is my primary problem with them. Also, MySQL at least has shitty error messages when you're trying to build or debug the things.

        • ht85 6 years ago

          Maintaining and migrating stored procedures can be a pain, especially when you're running a single tenant infrastructure, and even more if you don't migrate everyone at the same time.

          I've had a lot of success by creating a small build system for the database logic, that can be as simple as running a bunch of SQL files in order.

          The most important aspect is for the build to be "pure", rebuilding the whole thing every time instead of incrementally changing the schema.

          A practical way to do this is to have all your stored procedures in a separate schema from your data, and starting your build process with `DROP SCHEMA procs CASCADE`.
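
          A minimal version of that build script; the `procs` schema name and the function body are just illustrative:

              -- rebuild.sql: rerun in full on every deploy, never incrementally
              DROP SCHEMA IF EXISTS procs CASCADE;
              CREATE SCHEMA procs;

              CREATE FUNCTION procs.active_users()
              RETURNS SETOF users AS $$
                  SELECT * FROM users WHERE active;
              $$ LANGUAGE sql STABLE;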

      • Avamander 6 years ago

        I've had to debug a stored procedure, and they're horrible when you have to do that. Just a big headache. I literally rewrote every single one of those procedures in the software's primary language; now the code can be easily debugged, logged, replaced, and read.

        • pmontra 6 years ago

          I'm not a big fan of PL/pgSQL. Almost nobody is, so your approach is standard.

          It's OK if all access to the database goes through the primary application. Triggers and stored procedures make more sense when a database serves multiple applications, possibly in multiple languages; think of the db as the microservice in front of the data store. Simple stored procedures are not that bad.

      • twic 6 years ago

        Given:

            create table conversations (
              message_id integer primary key,
              conversation_id integer,
              timestamp timestamp,
              user_id integer,
              message varchar
            );
        
        Then:

            select distinct on (conversation_id) *
            from conversations
            order by conversation_id, timestamp desc;
        
        Should do it, although I believe "select distinct on" is a PostgreSQL extension rather than standard SQL.

        Now, what I don't know is how this is planned for a large table, or what indices would let you get a sensible plan. Maybe just an index on (conversation_id, timestamp)?

        • tfehring 6 years ago

          Extension or not, that's a really useful shortcut. Coming from SQL Server, the best option I know of requires a CTE or subquery:

              WITH CTE_messages AS
              (
                  SELECT
                       *
                      ,row_nbr = ROW_NUMBER() OVER(PARTITION BY conversation_id
                                                   ORDER BY timestamp DESC)
                  FROM
                      conversations
              )
              SELECT
                  *
              FROM
                  CTE_messages
              WHERE    row_nbr = 1
              ORDER BY
                  conversation_id
          
          This does give a performant plan in SQL Server with an index on (conversation_id, timestamp), for what it's worth. I imagine the same is true for this syntax in PostgreSQL. But SELECT DISTINCT sometimes does strange things to execution plans in general, and I wouldn't be surprised if that extends to SELECT DISTINCT ON.
      • blattimwind 6 years ago

        > He used triggers to store the id of the most recent message of every conversation in another table. That is probably the most efficient solution in any database.

        Depends on how often that last message query is run. If it's infrequent, the trigger is just write amplification.

  • gaius 6 years ago

    With Postgres you don't need MongoDB, InfluxDB, or any other trendy thing, Postgres does it all, and better than all the wannabes.

    • sz4kerto 6 years ago

      Is that really true? Postgres is great, but it doesn't cater for _all_ use cases. For example, horizontal scalability is -- understandably for an ACID RDBMS -- not as trivial as it is for some document stores.

      • gaius 6 years ago

        Check out what Citus are doing with PG https://www.citusdata.com

        • threeseed 6 years ago

          That's a fork of PostgreSQL.

          So it doesn't help you when it comes to using cloud hosted services e.g. AWS or Azure. It's also a commercial product that isn't exactly cheap ($890/node).

          • stareatgoats 6 years ago

            It's not a fork (according to their promotion), it's an 'extension', though I'm not sure I can explain the difference. Their solution looks interesting, but pricing wasn't very visible on their website, and what you quote is prohibitive for me (where does it say that?). I had at least assumed the 'open source' part would be free.

            • threeseed 6 years ago

              Pricing is available here: https://www.citusdata.com/pricing

              And the point still stands that this isn't core PostgreSQL. So you can't use managed services like AWS RDS with this. And you're reliant entirely on the vendor for support.

              So from the user perspective it is basically a fork.

              • manigandham 6 years ago

                PostgreSQL is completely open-source with no backing company so everyone is reliant on some vendor if they want support. It makes no difference that Citus is one of them, with some core PG devs onboard.

                Why would you talk about AWS RDS and then skip over Citus Cloud which is a managed service? If you consider this a fork then any plugin for PG is also a fork, which then makes it hard to have a productive discussion.

                • threeseed 6 years ago

                  The reason I am talking about AWS RDS is because most companies aren't multi-cloud. Data needs to be housed and secured within that private VPC. And so Citus Cloud is irrelevant because it is an externally housed product.

                  If Citus allowed their proprietary product to be run within existing clouds that could be something, but they don't. And again, since it is a significant add-on, i.e. beyond just a simple plugin, that almost demands vendor support, it is effectively a fork.

                  • manigandham 6 years ago

                    What does RDS have to do with being multi-cloud? It's a hosted offering which only works in AWS. And Citus Cloud does offer VPC connectivity so you can access it privately. As stated before, the extension is also open-source and you can run it in any cloud or datacenter yourself, for free. Why would you claim that's not allowed?

                    You seem to be stuck on a strange interpretation of vendor support when PG has no backing vendor, so you can pick whichever one you want. If you aren't comfortable with Citus and prefer AWS then that's fine, but you make the trade-off in the features as well. Not every managed postgres service chooses to support every PG extension so it's nothing unique in that regard, and is ultimately no different than the choices involved in picking any other database with vendor backing.

                    You've posted several comments now that seem to show that you do not know much about Citus or are willingly misleading people about it. I'm not sure why but perhaps it would be better if you read up on the details or disclosed your angle more clearly.

              • stareatgoats 6 years ago

                OK thanks for this. The product seems promising even if it is not core PG. I'd have to dive into the deep end and test drive to check if the trade-offs would be worthwhile. The 'open source version' seems free, so I guess I could at some point.

            • manigandham 6 years ago

              An extension means you're just running normal Postgres and load this on top. All of the normal features of Postgres are available and you can upgrade that as new versions are released without waiting for some other vendor to update their own specialized version of it. And yes, Citus is open-source and free for you to run yourself.

          • mslot 6 years ago

            Citus is an open source plug-in that you load into vanilla PostgreSQL, similar to PostGIS and other extensions.

    • Avamander 6 years ago

      Could you please elaborate how to use Postgres as a time series database (like InfluxDB)?

      • manigandham 6 years ago

        There is absolutely nothing special about "time-series" to be an actual type of database. It's all hype.

        Time-series data is data that has a time component which is usually the primary property to query by. Almost every database can handle this, like MongoDB/Redis for lightweight use, Cassandra for write-heavy/global replication, ElasticSearch for raw search-style querying, or an OLAP columnstore (Redshift, MSSQL, Snowflake, Clickhouse) for serious ingest and querying. Monitoring systems like Prometheus and Netdata even have time-series storage built-in because it just isn't that hard.

        InfluxDB is only really useful in the context of the integrations that it provides with the common monitoring software stacks. You can easily just create a regular relational table with "time" as a column and get great performance with an index, and then use partitioning to break up the table to get great performance over lots of data.

        pg_partman is an extension that makes partitioning automatic. Timescale is an extension that makes time-focused partitioning automatic. Citus is an extension that makes partitioning across multiple nodes automatic. Or use one of the distributed OLAP systems mentioned above.
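        As a sketch of the approach described above (table and column names are made up for illustration), a time-partitioned table with declarative range partitioning looks like:

        ```sql
        -- Time-series table using declarative range partitioning
        -- (hypothetical schema; pg_partman can automate creating partitions)
        CREATE TABLE metrics (
            time      timestamptz NOT NULL,
            device_id int         NOT NULL,
            value     double precision
        ) PARTITION BY RANGE (time);

        -- One partition per month
        CREATE TABLE metrics_2018_10 PARTITION OF metrics
            FOR VALUES FROM ('2018-10-01') TO ('2018-11-01');

        -- PostgreSQL 11 lets you create the index on the parent,
        -- cascading to all partitions
        CREATE INDEX ON metrics (time DESC);
        ```

        Queries with a WHERE clause on time only touch the relevant partitions, and expiring old data is just dropping a partition.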

      • StreamBright 6 years ago

        Create a table and have a timestamp for your data? Not sure what you are asking.

        On one hand we have: a 15+ year old, mature product that is widely supported and can be used as a time series database.

        On the other hand: the n+1th young database engine, with all of the problems that any young data storage engine has (the only exception is FoundationDB, because those guys actually understood the challenges of writing a reliable data store).

        The question is, what can InfluxDB offer over Postgres for storing time series data?

        • threeseed 6 years ago

          Well InfluxDB actually is a time series database. That's all it does.

          The query language is optimised for time based queries. The storage is optimised for time series. It is part of the metrics/monitoring ecosystem so it has integrations with tools like Grafana.

          PostgreSQL doesn't have any time series capabilities OOTB. It's just people modelling the concept in a relational fashion. No different to how Excel can be used as a time series database.

          And seriously everyone should just get over FoundationDB. There are plenty of people who have invented new databases that got it right e.g. Cassandra, HBase, Redis etc.

          • mslot 6 years ago

            PostgreSQL has excellent time series capabilities. It can load millions of rows per second, efficiently scan by time range, build rollup tables, has expressive SQL with excellent support for time (timezones, ranges, timestamps, intervals, conversion, etc.), it can combine multiple indexes to query large volumes of time series data quickly, it can do sampling, and it can expire data efficiently through partitioning. Ok, it's not entirely ootb. To make it work nicely you may need extensions like pg_partman to automate partitioning and Citus to scale out, but once you do you have a time series database that's faster and more powerful than anything else on the market.
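            A minimal sketch of the rollup-table pattern mentioned above (assuming a hypothetical raw table metrics(time, device_id, value)):

            ```sql
            -- Hourly rollup: bucket raw rows with date_trunc and aggregate
            -- (table and column names are illustrative)
            CREATE TABLE metrics_hourly AS
            SELECT date_trunc('hour', time) AS hour,
                   device_id,
                   avg(value) AS avg_value,
                   count(*)   AS samples
            FROM metrics
            GROUP BY 1, 2;
            ```

            Dashboards then read the small rollup table instead of scanning the raw data, and with partitioning, expiring old raw data is a cheap DROP TABLE on the oldest partition.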

            • threeseed 6 years ago

              Congratulations. You just described the capabilities of ANY SQL database.

              But again there are capabilities that exist only in InfluxDB because all it does is time series data. It's not multi-purpose. Which again is why you see it all over the place in the metrics/monitoring ecosystem but you never see PostgreSQL, MySQL etc.

              • xdanger 6 years ago

                Actually you can use PostgreSQL in those with timescaledb plugin. ;)

    • threeseed 6 years ago

      This is simply untrue.

      MongoDB is a much better JSON store. It's ridiculously faster, has much richer update semantics and it's simply not comparable when it comes to the ease of clustering.

      As for time series unless I am confused but PostgreSQL doesn't have any OOTB capabilities.

      • gaius 6 years ago

        MongoDB is a much better JSON store

        That hasn't been true for a long time, if it even ever really was when you consider that the key role of a database is to reliably persist your data https://www.enterprisedb.com/node/3441 (2014)
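        For context, PostgreSQL's jsonb support covers indexed document queries natively (a sketch; table and column names are made up):

        ```sql
        -- jsonb column with a GIN index for containment queries
        CREATE TABLE events (
            id      serial PRIMARY KEY,
            payload jsonb
        );
        CREATE INDEX ON events USING gin (payload);

        INSERT INTO events (payload)
        VALUES ('{"user": "alice", "tags": ["a", "b"]}');

        -- @> is the containment operator, served by the GIN index
        SELECT * FROM events WHERE payload @> '{"user": "alice"}';
        ```

        The @> containment operator plus GIN indexing is what lets a relational table serve document-style lookups without a separate store.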

        • threeseed 6 years ago

          Think about this for a second. You just posted a link that is over 4 years old.

          It's completely irrelevant. Especially since I am referring to version 3.0+, where they completely replaced the entire engine with WiredTiger. And from that point on, the performance for tuple-level updates, for example, is faster than I've seen from ANY database.

          And this nonsense about reliable persistence of data was a case of bad defaults many, many years ago. It was also fixed many, many years ago and so these tired memes just make you look bad.

          • gaius 6 years ago

            You just posted a link that is over 4 years old.

            Yes, to make my point that it hasn't been true "for a long time". Yes Mongo had JSON support before PG did, but PG soon surpassed it.

            these tired memes just make you look bad

            Nah, they make me look like what I am, an extremely experienced data guy. Maybe Mongo has some use cases in webdev, I wouldn't know.

  • sv12l 6 years ago

    If data is central to your business, and in all likelihood it is, then you care.

    • calcifer 6 years ago

      This is not a useful or constructive comment with its unstated argument of "only idiots use mysql". It's sarcastic flamewar bait.

      • guiriduro 6 years ago

        No, it's just the TL;DR version of a longer argument demonstrating MySQL's ability to lose, truncate, and misinterpret data and ignore commands, thereby creating false expectations (like: your data is validated and safe). PostgreSQL takes far greater care over data.

brightball 6 years ago

Great looking release. I’m really excited to see if the cstore extension can make good use of the parallelism improvements.

  • macdice 6 years ago

    It looks like someone would need to do the things described in here: https://www.postgresql.org/message-id/flat/CA%2Bz6ocRFEnThhX...

    There is an easy level "parallel safe" that would allow scans of different cstore partitions in a parallel query, and a harder-to-code "parallel aware" level that would allow parallel scans on one individual cstore_fdw relation. AFAIK no FDW has attempted parallel scans yet, but cstore looks like it may be an ideal case to be first?