This is pretty great. It looks like it automatically writes to the next logical empty cell.
^ That will append the last BTC price to column A of sheet 'foo'.
Can I nerd out over how unreasonably effective regexps are?
That's basically a mini JSON parser in 48 characters.
With a domain-specific tool, it's even easier though.
Sure. But you can't use jq to scrape arbitrary websites, for example. :)
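To make the regex-versus-parser comparison concrete, here is a minimal Python sketch. The payload and its field names ("ticker", "last") are made up for illustration and are not the actual API response or the regex from the comment above; it only shows the two approaches side by side.

```python
import json
import re

# Hypothetical API response; the structure and field names are assumptions.
payload = '{"ticker": {"symbol": "BTC", "last": "4321.50", "volume": "1200"}}'

# Regex approach: capture the quoted value that follows the "last" key.
LAST_RE = re.compile(r'"last"\s*:\s*"([^"]*)"')


def last_price_regex(text):
    """Return the first quoted value after a "last" key, or None."""
    match = LAST_RE.search(text)
    return match.group(1) if match else None


def last_price_json(text):
    """Return the same field by actually parsing the document."""
    return json.loads(text)["ticker"]["last"]


# Both approaches agree on this well-behaved input.
print(last_price_regex(payload))
print(last_price_json(payload))
```

The regex version is shorter and also works on text that merely contains JSON, but it picks up the first "last" key anywhere in the input, while the parser version fails loudly if the structure changes.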
Jeff Atwood has an entertaining post about parsing HTML with regular expressions:
https://blog.codinghorror.com/parsing-html-the-cthulhu-way/
"That's right, if you attempt to parse HTML with regular expressions, you're succumbing to the temptations of the dark god Cthulhu's … er … code."
Let's not forget about this masterpiece: https://stackoverflow.com/a/1732454/864310
Indeed, its quality cannot be ignored and must be shared; it’s referenced in the Atwood post.
Parsing and scraping are different things though. You don't need to parse a web page to extract specific things from it.
Of course, if it's anything like HTML, the formatting will vary over time, so you really want a more permissive parser like BeautifulSoup. I haven't found a CLI for it, so I wrote a quick one of my own ages ago: https://github.com/jldugger/dotfiles/blob/master/bin/select.....
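As a rough illustration of what a permissive parser buys you, here is a sketch that uses Python's standard-library html.parser instead of BeautifulSoup (swapped in only so the example has no third-party dependency). The LinkCollector class and extract_links helper are hypothetical names, and this is not the linked script.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from anchor tags, tolerating sloppy markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Unclosed tags, mixed case, and mixed quoting styles are all handled.
messy = '<p>Unclosed paragraph <A HREF="/one">one<a href=\'/two\' class=x>two</a>'
print(extract_links(messy))
```

A regex written against one quoting style or one letter case would need a new alternative for each variation here, which is the maintenance cost the comment is pointing at.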
For cases where a website is not a tutorial for websites, regex is a suitable tool for scraping.
Do NOT use r̔ͩegͩ̾ͪͥͪeͮ͊ͨ̓xͫ͆͆̓ ͤt͊͗o̒̾͋ͬ̾̚ ͆̌͌̄p͠a͟r̐̎͆̽̄ͭ́se̒ͥ ͕̪̻̭̭̺̳ͣͬͪͫͪj̞̲ͤ̚s̡̳̟̤̳̖̤͒̋ͣ͊ͤ͗̿oͬ̀͆n̮̳͚̝̩͙͔̈͋ ̮͕̩̼̔̾̈̄̋ͦ́́̚ͅĤ̷̝̯̝ͯ̈́ͣ̔ͪ̊ͬ͜͡Eͬͧ҉̜̰̲̩̰̝̠̥ ̶̯̯͚̗̪̭̘ͨͩͭ̎ͧͮCͪ҉̖͍͔͚͚̯͕Ơ̰̻̂̅̋̇̓̅͌M̸̭̱̭̥͆̽ͨͦÊ̸̴̢̪͚̮̲̜̙̍ͤ͋̾ͦS̛̗̟͙͍̹͈̳ͣ͑̏̓ͤͦ̽
Please don't do this here.
I wrote something similar that will send stdin to Google Stackdriver Logging.
https://github.com/thesandlord/logpipe
In the same vein, pipe anything to `logger` to send it to the local syslogd (and if configured, to remote destinations).
For example, this uses datamash (https://www.gnu.org/software/datamash/) to sum column 2 of a CSV ("-t,"), then log the output:
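For anyone without datamash handy, a rough Python sketch of the same idea (sum column 2 of comma-separated input, then log the result) might look like the following. The sum_column helper is a hypothetical name, rows with a missing or non-numeric value are skipped as a simplification, and the total goes to Python's logging module rather than to `logger`.

```python
import csv
import io
import logging


def sum_column(lines, column=2, delimiter=","):
    """Sum a 1-indexed column; rows where it is missing or non-numeric are skipped."""
    total = 0.0
    for row in csv.reader(lines, delimiter=delimiter):
        try:
            total += float(row[column - 1])
        except (IndexError, ValueError):
            continue
    return total


# Sample data stands in for real input; pass sys.stdin to use it as a filter.
sample = io.StringIO("apples,1\npears,2.5\nheader,n/a\n")

logging.basicConfig(level=logging.INFO)
logging.info("column sum: %s", sum_column(sample))
```

To send the message to the local syslog daemon instead, logging.handlers.SysLogHandler can be attached to the logger; the socket address it needs depends on the platform.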
Yeah, logger or fluentd are definitely better long-term solutions. I wanted something no-frills with minimal config for ad-hoc situations. Others on my team found it useful, so I open-sourced it.
Bonus: In-depth code review for this package: https://youtu.be/c5ufcpTGIJM
Amazing.
We need more utilities like this that help integrate apps and web services with each other and with the Unix environment.
"Pipes to the web" should be like this. Not a bloated UI with forms that perform premade actions on your web app's data.
They mean stdout, right?
Depends on from where you look :)
It is stdout for the previous process in the pipeline and stdin for tosheets itself.
EDIT: Tosheets -> tosheets
tosheets just reads stuff from stdin. It doesn't care if it's attached to a pipe or not. It might be that there's no "previous process" at all.
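A small Python sketch of that point: a filter-style program iterates over its input stream the same way whether the stream is a pipe, a redirected file, or a terminal. The read_rows and describe_source helpers are hypothetical and are not taken from the tosheets source.

```python
import io


def read_rows(stream, delimiter=None):
    """Split each non-blank line of a text stream into cells."""
    rows = []
    for line in stream:
        if line.strip():
            # delimiter=None splits on any run of whitespace.
            rows.append(line.rstrip("\n").split(delimiter))
    return rows


def describe_source(stream):
    """Report whether a stream is an interactive terminal or something else."""
    return "terminal" if stream.isatty() else "pipe, file, or other non-terminal"


# io.StringIO stands in for stdin here; passing sys.stdin works the same way.
fake_stdin = io.StringIO("1 2 3\n\n4 5\n")
print(describe_source(fake_stdin))
print(read_rows(fake_stdin))
```

The reading code never asks where the bytes come from; the isatty check is only there to show that the distinction is optional information, not something the read loop depends on.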
No.
From "tosheets"'s stdin to sheets.
very cool
Interestingly Nice
Your project ruins my English
Nope, it sends stdout to sheets. Getting input from sheets would be cool, too.
Consider the command:
In this example, stdin is what goes to the spreadsheet. stdout is (I assume) nothing.
tosheets definitely reads from stdin: https://github.com/kren1/tosheets/blob/8a853d2ebb722c474f5bb... :)
It looks to me like you're being downvoted, but what you're saying looks correct to me. $PROGRAM sends to stdout what Google Sheets reads from "stdin".
When you run this utility, it forwards stdin to Google Sheets.
Should you pipe stdout from another program into this utility, the contents of that stdout will be forwarded to Google Sheets.
It is correct to say either that you are forwarding stdin or that you are forwarding stdout. It is therefore not correct to say that it does not forward stdin, because the source code explicitly refers to stdin. That's likely the reason for the [Fight Club]s the GP is receiving.
From the HN guidelines:
Please don't comment about the voting on comments. It never does any good, and it makes boring reading.