Notes about open source software, computers, other stuff.

Tag: programming (Page 1 of 3)

Bulk downloading and renaming of Expensify PDF reports

In my company, we have been using Expensify to manage small receipts, travel expenses, etc. Recently, however, I decided to switch to another platform that is part of the SaaS platform our accountant uses. Even though it lacks some of the functionality provided by Expensify, having all receipts in a single location reduces the amount of time I have to spend on administrative tasks.

Every quarter, Dutch companies have to file a VAT report. For each report I exported the Expensify data as CSV files (to send to my accountant) and as a PDF, a more “visual” backup that lists the reported expenses sorted by category and, importantly, also includes the scans of the various receipts.

As we changed accountants a couple of years ago, I wasn’t sure whether I had actually downloaded both the CSV and the PDF file for each Expensify report. Keeping records is required by Dutch law, so I decided to make sure and download all PDF files and back them up somewhere.

Unfortunately, the Expensify website doesn’t offer an option for bulk downloading of the PDF files. They do offer a kind of REST API (they call it the Integration Server), which I had played with years ago, so I decided to try that. Luckily, the credentials I had saved in my password manager still worked.

The process for downloading the PDFs consists of two steps:

  • Run a command to generate the reports; this returns the file names of the PDF files.
  • Use those names to download the PDFs.

The first step took a couple of minutes to run and then listed the file names of the PDFs on stdout:

curl -X POST 'https://integrations.expensify.com/Integration-Server/ExpensifyIntegrations' \
    -d 'requestJobDescription={
        "type":"file",
        "credentials":{
            "partnerUserID":"XXXXXXXXXX",
            "partnerUserSecret":"YYYYYYYYYY"
        },
        "onReceive":{
            "immediateResponse":["returnRandomFileName"]
        },
        "inputSettings":{
            "type":"combinedReportData",
            "filters":{
                "startDate":"2013-01-01"
            }
        },
        "outputSettings":{
            "fileExtension":"pdf",
            "includeFullPageReceiptsPdf":"true"
        }
    }' \
    --data-urlencode 'template@expensify_template.ftl'

I’m not sure what the expensify_template.ftl file does in this command, but I had to create it locally; otherwise the curl call returned an error. I simply copied the example template from the documentation for the Expensify Integration Server.

I made a copy of the long list of PDF file names output by the above command. A typical file name looks like this: exportc992bd79-aa4a-4b04-a76a-1149194bac94-34589514.pdf. Not very descriptive… As expected (and confirmed in the web UI), there were 191 file names.

Next, step two: actually downloading the files. The basic call for that is:

curl -X POST 'https://integrations.expensify.com/Integration-Server/ExpensifyIntegrations' \
    -d 'requestJobDescription={
        "type":"download",
        "credentials":{
            "partnerUserID":"XXXXXXXXXX",
            "partnerUserSecret":"YYYYYYYYYY"
        },
        "fileName":"exportc992bd79-aa4a-4b04-a76a-1149194bac94-5803035.pdf",
        "fileSystem":"integrationServer"
    }' \
    --data-urlencode 'template@expensify_template.ftl' --output "my_output.pdf"

So, in order to download all PDFs, I saved all file names in the file pdflist. All PDF file names are unique:

$ wc -l pdflist
191 pdflist
$ sort pdflist| uniq | wc -l
191
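Had the two counts differed, uniq -d would have shown which names occur more than once. A minimal sketch, using a small made-up list instead of the real pdflist (the file names here are invented for illustration):

```shell
# Hypothetical stand-in for pdflist, with one deliberate duplicate.
list=$(mktemp)
printf '%s\n' export-a.pdf export-b.pdf export-a.pdf export-c.pdf > "$list"

# uniq only compares adjacent lines, so sort first;
# -d prints one copy of each line that occurs more than once.
duplicates=$(sort "$list" | uniq -d)
echo "$duplicates"

rm -f "$list"
```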

Next, I used a loop to read each line from the pdflist file. I fiddled a bit with the quotes so I could use the pdf variable inside the curl call (the single-quoted JSON is briefly closed around "${pdf}") and download each file:

while IFS= read -r pdf; do
    curl -X POST 'https://integrations.expensify.com/Integration-Server/ExpensifyIntegrations' \
        -d 'requestJobDescription={
            "type":"download",
            "credentials":{
                "partnerUserID":"XXXXXXXXXX",
                "partnerUserSecret":"YYYYYYYYYY"
            },
            "fileName":"'"${pdf}"'",
            "fileSystem":"integrationServer"
        }' \
        --data-urlencode 'template@expensify_template.ftl' --output "${pdf}"
done < pdflist
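The quote juggling can be avoided by building the JSON in a small helper with printf. This is only a sketch: build_download_job and the two EXPENSIFY_* variables are hypothetical names, the field names are copied from the download call above, and it assumes the file names contain no double quotes or backslashes (true for the export names shown here).

```shell
# Hypothetical helper: print the requestJobDescription JSON for one file name.
# Field names are taken from the download call in this post; the credentials
# are read from the (assumed) variables EXPENSIFY_USER_ID and EXPENSIFY_SECRET.
build_download_job() {
    printf '{"type":"download","credentials":{"partnerUserID":"%s","partnerUserSecret":"%s"},"fileName":"%s","fileSystem":"integrationServer"}' \
        "$EXPENSIFY_USER_ID" "$EXPENSIFY_SECRET" "$1"
}

EXPENSIFY_USER_ID="XXXXXXXXXX"
EXPENSIFY_SECRET="YYYYYYYYYY"
job=$(build_download_job "exportc992bd79-aa4a-4b04-a76a-1149194bac94-5803035.pdf")
echo "$job"

# Inside the loop, the payload would then be passed to curl as:
#   -d "requestJobDescription=${job}"
```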

This indeed gave me 191 Expensify report PDFs, with very uninformative names 😐. To fix that I resorted to some more shell “scripting”. Every report has a title (usually something like “Small expenses 2020 Q4”), and in the output of the pdftotext utility the title always appeared to be on the third line. So I moved the original PDFs to a separate “archive” directory, OriginalExports, and ran the following to copy each PDF to a new file named after its title. My first attempt was only partly successful: the number of renamed PDFs was smaller than the number of originals. I guessed that this happens when two reports have the same title, and adding -i to the cp command, which asks before overwriting an existing file, showed I was right. As this affected only four files, I renamed those manually.

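for pdf in OriginalExports/export*.pdf; do
    echo "${pdf}"
    # The report title is on the third line of the text output;
    # slashes are replaced so the title is a valid file name
    title=$(pdftotext "${pdf}" - | head -3 | tail -1 | tr "/" "_")
    # -i asks before overwriting when two reports share a title
    cp -i "${pdf}" "${title}.pdf"
done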
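Instead of sorting out name clashes by hand, the copy step could pick a free name itself. A minimal sketch, where unique_name is a hypothetical helper that appends a counter when a title is already taken; it is demonstrated on empty files in a temporary directory rather than on real reports:

```shell
# Hypothetical helper: print "<title>.pdf", or "<title> (2).pdf",
# "<title> (3).pdf", ... if that name already exists in the directory
# given as second argument. (Plain sh: the variables are not function-local.)
unique_name() {
    title=$1
    dir=$2
    candidate="${title}.pdf"
    n=2
    while [ -e "${dir}/${candidate}" ]; do
        candidate="${title} (${n}).pdf"
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
}

# Demonstration with three "reports" that share one title.
demo=$(mktemp -d)
first=$(unique_name "Small expenses 2020 Q4" "$demo")
touch "${demo}/${first}"
second=$(unique_name "Small expenses 2020 Q4" "$demo")
touch "${demo}/${second}"
third=$(unique_name "Small expenses 2020 Q4" "$demo")
printf '%s\n' "$first" "$second" "$third"
rm -rf "$demo"
```

In the renaming loop, the copy would then become something like cp "${pdf}" "$(unique_name "${title}" .)".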

So there I had my backup of all receipts since we started using Expensify. And if the tax office or the accountant ever want to see those receipts, I am now sure I can provide them.


Upgrading nodejs to the latest LTS release on Ubuntu 21.10

Today I upgraded the Bash language server (to v3.0.3), after which I noticed that it stopped working. When loading a .bash file, the language server didn’t load and told me to look in the error output for more information. In Emacs, the errors of the Bash language server can be found in the *bash-ls::stderr* buffer, which showed me:

/home/lennart/.emacs.d/.cache/lsp/npm/bash-language-server/lib/node_modules/bash-language-server/node_modules/vscode-jsonrpc/lib/common/linkedMap.js:40
        return this._head?.value;
                          ^

SyntaxError: Unexpected token '.'
    at wrapSafe (internal/modules/cjs/loader.js:915:16)
    at Module._compile (internal/modules/cjs/loader.js:963:27)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10)
    at Module.load (internal/modules/cjs/loader.js:863:32)
    at Function.Module._load (internal/modules/cjs/loader.js:708:14)
    at Module.require (internal/modules/cjs/loader.js:887:19)
    at require (internal/modules/cjs/helpers.js:74:18)
    at Object.<anonymous> (/home/lennart/.emacs.d/.cache/lsp/npm/bash-language-server/lib/node_modules/bash-language-server/node_modules/vscode-jsonrpc/lib/common/api.js:37:21)
    at Module._compile (internal/modules/cjs/loader.js:999:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10)

I re-ran lsp-install-server, which pointed out that I had nodejs v12.22.5 installed and the language server required v14 or higher.
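Such a version check is easy to script. A minimal sketch, assuming the version string has the form printed by node --version (for example v12.22.5); node_major_at_least is a hypothetical helper name, and the minimum of 14 is the one reported by lsp-install-server above:

```shell
# Hypothetical helper: succeed if a version string such as "v12.22.5"
# has a major version of at least the given minimum.
node_major_at_least() {
    version=${1#v}          # strip the leading "v"
    major=${version%%.*}    # keep everything before the first dot
    [ "$major" -ge "$2" ]
}

if node_major_at_least "v12.22.5" 14; then
    echo "v12.22.5 is new enough"
else
    echo "v12.22.5 is too old"
fi

# With nodejs installed, the real version would be checked with:
#   node_major_at_least "$(node --version)" 14
```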

Time to figure out how to install a newer nodejs version on my Ubuntu 21.10 machine. It turns out that v12 is no longer maintained; the current LTS version of nodejs is v16. I found instructions on how to install a given version of nodejs on Ubuntu. For v16, this boils down to running:

curl -sL https://deb.nodesource.com/setup_16.x | sudo bash -

The script that this command fetches (and executes as root) is quite elaborate, but in the end it simply creates the file /etc/apt/sources.list.d/nodesource.list, with the following contents:

deb [signed-by=/usr/share/keyrings/nodesource.gpg] https://deb.nodesource.com/node_16.x impish main
deb-src [signed-by=/usr/share/keyrings/nodesource.gpg] https://deb.nodesource.com/node_16.x impish main

After that, a simple apt upgrade didn’t suffice. The nodejs upgrade was held back because of a dependency problem. Even an explicit upgrade of the nodejs package didn’t work:

$ sudo apt upgrade nodejs
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies.
 libnode72 : Conflicts: nodejs-legacy
E: Broken packages

So, I resorted to a full apt dist-upgrade, which worked fine. After that, I reopened a Bash script and all was fine.


Configuring Org2blog

Yesterday I installed Org2blog, which allows me to write my blog posts in Emacs org-mode and push them to my WordPress blog from within Emacs. So far I like it a lot! One less reason to leave Emacs :-), and hopefully also a reason to blog more often. Other good things about keeping your blog posts in Emacs are:

  • You can simply export them to e.g. PDF. In my current setup it’s as easy as adding the line

    #+LATEX_CLASS: lckartcl

    somewhere at the top of the file (before the actual text of the post starts) to tell org-mode that it should use my personal LaTeX export style, followed by C-c C-e l o, and a nicely formatted PDF of my blog post pops up.

  • You keep all your blog posts in plain text format, so if you ever decide to switch to a different blogging platform, uploading the old posts should be fairly easy.

Org2blog’s GitHub page mentions C-c p as the prefix key for Org2blog’s functions, but in my case this prefix is already used by Projectile. Looking in Org2blog’s Customize group, I noticed that C-c M-p is an alternative prefix, so I’m using that to get the following functionality:

C-c M-p p    publish buffer
C-c M-p P    post buffer as page and publish
C-c M-p d    post buffer as draft
C-c M-p D    post buffer as page draft
C-c M-p t    complete category
This is the Org2blog configuration in my .emacs file (note that I’m using John Wiegley’s use-package macro):

;;;;;;;;;;;;;;;;;;;;
;; Configure Org2blog, which allows me to write blog posts in org-mode
;; and then push them to my WordPress blog.
(use-package org2blog
  :config
  (require 'org2blog-autoloads)
  (setq org2blog/wp-blog-alist
        '(("blog.karssen.org"
           :url "https://blog.karssen.org/xmlrpc.php"
           :username "xxxxxx"
           :default-title "New blog post"
           :default-categories "Linux"
           :tags-as-categories nil))))


Using Magit to commit only some of the changes in a file

As I discussed in an earlier post, git allows you to commit only some of the changes you made to a given file. If you are working in Emacs, you probably already know the wonders of Magit. To do the same partial commit of a file there, simply open magit-status and go to the file you’re interested in. This will highlight the changed parts of the text. With your cursor in the changed block you’d like to commit, simply press s and that change will be staged. If this is all you want, press c to commit and you’re done!

Source


DatABEL v0.9-6 has been published on CRAN

This morning version 0.9-6 of the DatABEL R package was published on CRAN. This is only a minor update that consists of a few small changes and one bug fix. See the official announcement for more information.

DatABEL is an R package that allows users to access files with large matrices (of several gigabytes or more in size) in a fast and efficient manner. The package is mainly used for genome-wide association analyses using e.g. ProbABEL or OmicABEL.


Git: commit only some of the changes in a file

If you only want to commit some of the changes to a file in a Git repository, use

git add --patch your_changed_file

This will interactively ask you, hunk by hunk, which changes to stage:

$ git add --patch .emacs
diff --git a/.emacs b/.emacs
index d903495..5a0eb9e 100644
--- a/.emacs
+++ b/.emacs
@@ -69,9 +69,9 @@
 
 ;;; Make better buffer names when opening files with the same name
-(when (autoload 'uniquify "uniquify" "uniquify" t)
+(when (require 'uniquify nil 'noerror)
   (setq uniquify-buffer-name-style 'post-forward-angle-brackets)
-  )
+)
 
Stage this hunk [y,n,q,a,d,/,K,j,J,g,s,e,?]?

Source and more information on StackOverflow.com.
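To see the effect without touching a real repository, the whole thing can be tried in a throwaway one. The sketch below is only an illustration: the file name and the demo identity are made up, and the answers to the prompts (y for the first hunk, n for the second) are fed in through a pipe instead of being typed.

```shell
# Throwaway repository in a temporary directory.
repo=$(mktemp -d)
git -C "$repo" init --quiet

# A 30-line file, committed as the starting point.
seq 1 30 > "$repo/numbers.txt"
git -C "$repo" add numbers.txt
git -C "$repo" -c user.name=Demo -c user.email=demo@example.com \
    commit --quiet -m "Initial version"

# Change line 2 and line 29; being this far apart, they end up in separate hunks.
seq 1 30 | sed -e '2s/.*/two/' -e '29s/.*/twenty-nine/' > "$repo/numbers.txt"

# Stage the first hunk (y) and skip the second (n).
printf 'y\nn\n' | git -C "$repo" add --patch numbers.txt > /dev/null

staged=$(git -C "$repo" diff --cached)   # what the next commit would contain
unstaged=$(git -C "$repo" diff)          # what is still only in the working tree
printf '%s\n' "$staged"

rm -rf "$repo"
```

After an interactive git add --patch, git diff --cached is a convenient way to review what was staged before committing.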


Getting the version of a remote SVN repository via SSH

A quick note to self: I wanted to find out which version of Subversion is running on R-forge, which I access via SSH. This is how to do it:

$ ssh username@svn.r-forge.r-project.org svnserve --version
svnserve, version 1.6.17 (r1128011)
   compiled Nov 20 2011, 01:10:33

Copyright (C) 2000-2009 CollabNet.
Subversion is open source software, see http://subversion.apache.org/
This product includes software developed by CollabNet (http://www.Collab.Net/).

The following repository back-end (FS) modules are available:

* fs_base : Module for working with a Berkeley DB repository.
* fs_fs : Module for working with a plain file (FSFS) repository.

Cyrus SASL authentication is available.


ProbABEL v0.4.4 released

It was quite a long time in the making and then a bunch of other stuff came in between, but I finally managed to release v0.4.4 of ProbABEL!

ProbABEL is a toolset for doing fast, memory (RAM) efficient genome-wide regression tests.

This is a bugfix release, but a major one for those who use the Cox proportional hazards regression module. Thanks to some of our users on the GenABEL forum, a serious bug leading to way too many NaNs in the output was discovered, fixed and tested. This is one of the best examples of community collaboration I have seen in the GenABEL project.

Another bug fixed in this release is one that caused a failed install on Mac OS X and FreeBSD. Again, this bug was reported on the forum by one of our users. Great work!

Uploads to Debian and the Ubuntu PPA are coming ASAP.

Now, let’s get ready for a new feature release, which will include p-value calculation (a long-standing feature request) and major speed-ups (implemented by former colleague Maarten Kooyman). Time to get to work ;-)!


Changing the default mode of the Emacs scratch buffer

After starting Emacs you end up in the *scratch* buffer (assuming you’ve disabled the startup message in your .emacs file). The *scratch* buffer can be used for writing down notes and for some Lisp experiments (since it uses the Emacs Lisp major mode by default).

Now, I’m not very much of a Lisp programmer, but I do use Org-mode a lot. Consequently, I found myself changing the buffer’s major mode to org-mode regularly. And Emacs wouldn’t be Emacs if you couldn’t change this to a default. So, thanks to Bozhidar Batsov over at Emacs Redux, I’ve added the following lines to my Emacs configuration file:

;; Set the default mode of the scratch buffer to Org
(setq initial-major-mode 'org-mode)
;; and change the message accordingly
(setq initial-scratch-message "\
# This buffer is for notes you don't want to save. You can use
# org-mode markup (and all Org's goodness) to organise the notes.
# If you want to create a file, visit that file with C-x C-f,
# then enter the text in that file's own buffer.
 
")


Implicit make rules and linking to libraries

Note to self: If relying on implicit make rules, then the libraries you want to link to need to go into the LDLIBS variable, not in the LDFLAGS variable.

The case at hand: I wanted to do a quick test on how to write gzipped files using the Boost libraries. Because this was a simple example, I also wanted a simple Makefile to accompany it, meaning I wanted to use implicit rules.

Here’s the example C++ code I used, slightly modified from the Boost example:

#include <fstream>
#include <iostream>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/gzip.hpp>
 
namespace io = boost::iostreams;
int main()
{
    using namespace std;
 
    ifstream infile("hello.txt", ios_base::in | ios_base::binary);
    ofstream outfile("hello.txt.gz", ios_base::out | ios_base::binary);
    io::filtering_streambuf<io::output> out;
    out.push(io::gzip_compressor());
    out.push(outfile);
    io::copy(infile, out);
 
    return 0;
}

The accompanying Makefile looks like this:

CXXFLAGS=-I/usr/include/boost
LDLIBS=-lboost_iostreams -lboost_system -lstdc++
 
# Alternative: link with g++ instead of cc. If CC is left at its default,
# -lstdc++ must be added to LDLIBS (as done above).
#CC=g++
 
PROGRAM=boost_write_gzip
 
$(PROGRAM): $(PROGRAM).o
 
clean:
	$(RM) $(PROGRAM).o $(PROGRAM)

Notes

  • Note the addition of -lstdc++ to LDLIBS. This is needed because the implicit rule uses cc to do the linking, which is no problem for C++ code as long as you add the C++ standard library. Alternatively, you can set CC=g++ as shown in the comment, instead of adding -lstdc++.
  • Note that since somewhere around Boost v1.50, the addition of -lboost_system is required.
  • This was done on a machine running Ubuntu Linux 13.10 with Boost version 1.53 (the libboost-all-dev package).
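Why LDLIBS and not LDFLAGS? The GNU make manual gives the implicit recipe for linking a single object file as $(CC) $(LDFLAGS) n.o $(LOADLIBES) $(LDLIBS), so LDFLAGS ends up before the object file and LDLIBS after it. Linkers that resolve symbols from left to right need the libraries after the objects that use them. The sketch below only rebuilds that command line as a string in the shell; the -L path is a made-up example and nothing is actually compiled.

```shell
# Mimic the recipe the GNU make manual documents for linking one object file:
#   $(CC) $(LDFLAGS) n.o $(LOADLIBES) $(LDLIBS)
CC=cc
LDFLAGS="-L/usr/local/lib"     # hypothetical search path, for illustration only
LOADLIBES=""
LDLIBS="-lboost_iostreams -lboost_system -lstdc++"

# Unquoted on purpose: word splitting drops the empty LOADLIBES.
link_command=$(echo $CC $LDFLAGS boost_write_gzip.o $LOADLIBES $LDLIBS)
echo "$link_command"
```

Running make -n next to the Makefile prints the commands make would really run, without executing them, which is a quick way to check where each variable ends up.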


© 2024 Lennart's weblog
