All posts tagged #imadethis
For local development, I like to change some things in my log4j2 configuration. Make lines shorter, e.g. no worker names (just pid) or dates (just times). Prevent all logging to files, just console. Use colors for errors and warnings so they stand out. Filter out specific types of messages, and keep only specific other ones at debug level. Use buffering and async for better performance. So here it is! <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <Configuration monitorInterval="30"> <Appenders> <Console name="Console" target="SYSTEM_OUT" bufferSize="100">... full post»
Java streams have findAny and findFirst methods out of the box. And they work fine; they do exactly what they say: return one of the matches. But in my experience, I often don't want 'one of the matches'; I want there to be a single match, and I want that one. The problem For example, finding the instances in a stream of customers var formExpressionInstance = getCustomers().stream() .filter(instance -> isDateInRange(ferenceDate, instance.getDateStart(), instance.getDateEnd())) .findFirst() If there are multiple matches, maybe... full post»
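To make the "exactly one match" idea from the excerpt above concrete, here is a minimal sketch of a collector that fails loudly when more than one element matches. The names `SingleMatch` and `atMostOne` are made up for this illustration; they are not part of the JDK and not necessarily what the full post ends up with.

```java
import java.util.Optional;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical helper, not a JDK class.
final class SingleMatch {
    private SingleMatch() {}

    /**
     * Collects a stream into an Optional:
     * empty for no elements, the element for exactly one,
     * and an IllegalStateException for more than one.
     */
    static <T> Collector<T, ?, Optional<T>> atMostOne() {
        return Collectors.collectingAndThen(
            Collectors.toList(),
            matches -> {
                if (matches.size() > 1) {
                    throw new IllegalStateException(
                        "expected at most one match, found " + matches.size());
                }
                // Zero or one element left, so findFirst is unambiguous here.
                return matches.stream().findFirst();
            });
    }
}
```

It would replace the `.findFirst()` call with `.collect(SingleMatch.atMostOne())`. This version collects every match into a list first, which is simple but wasteful for large streams; putting `.limit(2)` before the collect would avoid that, at the cost of a less precise error message.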
It's sometimes suggested, perhaps jokingly, that we should count on our fingers in binary, as we'd get all the way from 0 to 1023! Much better than the 0 to 10 we get 'normally'. Of course, it's rather difficult for most people to know at a glance which number is represented by .|..| ||.|. (probably 256+32+16+8+2=314). This way of counting is most often suggested by programmers, because it is simply base 2, just like computers use. A... full post»
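The finger pattern in the excerpt above can be checked with a few lines of code. This is a small sketch, assuming `|` is a raised finger (1), `.` is a lowered finger (0), and the leftmost finger is the most significant bit, which matches the 256+32+16+8+2 sum. The class name `FingerBinary` is invented for the example.

```java
// Hypothetical helper for decoding a two-hand finger pattern as a binary number.
final class FingerBinary {
    private FingerBinary() {}

    /** Reads '|' as 1 and '.' as 0, most significant finger first. */
    static int decode(String fingers) {
        int value = 0;
        for (char c : fingers.toCharArray()) {
            if (c == '|') {
                value = value * 2 + 1;
            } else if (c == '.') {
                value = value * 2;
            }
            // Anything else (such as the space between hands) is skipped.
        }
        return value;
    }
}
```

With these assumptions, `FingerBinary.decode(".|..| ||.|.")` gives 314, and ten raised fingers give 1023.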
How my thinking about programming changed since I started out
Page I made to see if English is still readable without word order and postfixes?
Recently, I've been interested in designing software to make hard-to-find bugs reveal themselves at compile-time. An idea I've read about (I forgot where) that I found fascinating is the use of a special type for IDs in objects that represent database rows. Problem For example, there may be a User table and a Posts table in the database, with each post belonging to one user. Each of these would be mapped to a class/struct in the programming language. Perhaps each... full post»
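As an illustration of the typed-ID idea in the excerpt above, here is a minimal sketch in Java. All names (`UserId`, `PostId`, `User`, `Post`, `TypedIds`) are assumptions made for the example, and the full post may well use a different language or design.

```java
// Each table gets its own ID type, so the compiler can tell them apart.
record UserId(long value) {}
record PostId(long value) {}

// Hypothetical row classes for the User and Posts tables.
record User(UserId id, String name) {}
record Post(PostId id, UserId authorId, String title) {}

final class TypedIds {
    private TypedIds() {}

    /** True if the post belongs to the given user. */
    static boolean isAuthor(User user, Post post) {
        return post.authorId().equals(user.id());
    }

    // With plain long IDs, passing a post ID where a user ID is expected compiles fine.
    // Here the equivalent mistake is rejected at compile time:
    //   new Post(new PostId(1), new PostId(7), "Hello");  // error: PostId is not a UserId
}
```

The wrapper types cost a little boilerplate, but mixing up a user ID and a post ID becomes a compile error instead of a wrong query result.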
First, booleans Booleans cannot be assigned to any other types, and you can't do any arithmetic on them. So true doesn't count as 1, nor does 1 count as true. No truthy and falsy in Java. Converting: a = b Some types can only be converted if you cast explicitly. These are usually the cases where a type with a wider or different range is converted to one with a narrower range. If you use explicit casting, you might be... full post»
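A few lines of code show the conversion rules from the excerpt above. This is a small sketch; the class and method names are made up for the example.

```java
// Hypothetical demo class for Java's primitive conversions.
final class Conversions {
    private Conversions() {}

    /** Widening (int to long) needs no cast and never loses information. */
    static long widen(int value) {
        return value;
    }

    /** Narrowing (int to byte) needs an explicit cast and keeps only the low 8 bits. */
    static byte narrow(int value) {
        return (byte) value;
    }

    /** double to int needs an explicit cast and truncates toward zero. */
    static int truncate(double value) {
        return (int) value;
    }

    // Booleans do not take part in any of this; neither line below compiles:
    //   int one = true;
    //   boolean yes = (boolean) 1;
}
```

For example, `narrow(300)` is 44 (300 minus 256), `narrow(200)` is -56, and `truncate(-3.99)` is -3.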
During the Kaggle Data Science Bowl 2017, the leaderboard was based on only $198$ samples. The opportunity for overfitting was quickly understood, but initially only the naive option was mentioned: testing one sample per submission, which would take 66 days (still doable within the competition duration, but less than ideal). But then Oleg Trott got a perfect score in just 14 submissions! I was really curious how he managed to do this. Together with Cas, I found out one way it... full post»
Linear models vs decision trees I’ve used the R statistical language a bit, a long time ago. It was my first real encounter with data science, but later encounters used Matlab and Python. But lately I’ve been picking up R again, as it’s popular in the data science community. As practice / demo, I thought I’d do a simple exploration of the strengths and weaknesses of linear models versus decision trees. This was inspired by Claudia Perlich at kdnuggets. Let’s... full post»
I wrote about SCons several weeks ago. It took some getting used to, but I'm very happy with the result! I've found it to be more expressive and flexible than Make. Or at least non-standard things are much more straightforward; I guess Make can do almost anything with enough recursion and voodoo. I can't really comment on the speed; it's plenty fast, but there are fewer than 50 files. I took some time putting it together, partially because there aren't... full post»
Daily backups, even if incremental, can fill up your disk space pretty quickly. So I've added some functionality to Dory to remove old backups. I wanted a way to keep a number of backups, with recent backups being closer together and past ones further apart. I've ended up using the reciprocal time-distance to each neighbour, multiplied by the square root of the age, as a 'redundancy score'. This way, if there are close neighbours (e.g. two backups made on the... full post»
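The redundancy score described above could look roughly like this. It is only a sketch under my own assumptions, not Dory's actual code: ages are in days and sorted ascending, the reciprocal distances to the previous and next backup are summed, and the backup with the highest score is the first candidate for removal. The class and method names are invented.

```java
// Hypothetical sketch of the 'redundancy score'; not taken from Dory itself.
final class Redundancy {
    private Redundancy() {}

    /**
     * Score for the backup at index i.
     * Assumes ages are in days, sorted ascending, with no duplicates.
     */
    static double score(double[] ages, int i) {
        double closeness = 0.0;
        if (i > 0) {
            closeness += 1.0 / (ages[i] - ages[i - 1]);   // distance to the newer neighbour
        }
        if (i < ages.length - 1) {
            closeness += 1.0 / (ages[i + 1] - ages[i]);   // distance to the older neighbour
        }
        // Older backups count as more redundant for the same spacing.
        return closeness * Math.sqrt(ages[i]);
    }

    /** Index of the backup with the highest score, i.e. the first one to remove. */
    static int mostRedundant(double[] ages) {
        int best = 0;
        for (int i = 1; i < ages.length; i++) {
            if (score(ages, i) > score(ages, best)) {
                best = i;
            }
        }
        return best;
    }
}
```

For backups aged 1, 4, 9, 10 and 16 days, the 9 and 10 day backups sit close together and get the highest scores, and the older of the two comes out as most redundant. Whether the real script sums the two neighbours or combines them differently is not something the excerpt settles.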
Having solved backups, I attacked the next important thing that gets in the way of building actual functionality: error logging. I've had some trouble setting up Sentry (possibly more my fault than Sentry's, but it does much more than I need so it's more work). I also tried Loggly which doesn't need server setup, but it captured mostly irrelevant syslog stuff and I'd need to pay a lot to keep logs for more than 7 days. So I figured... full post»
Ideally, you'd want your static files to be cached so that clients don't need to request them at all. In Apache, you can achieve this with: ExpiresActive on ExpiresDefault "access plus 1 year" Header append Cache-Control public But you also want the clients to request the new version as soon as anything changes. Since this can't be achieved with cache headers (they'll not request the files, so they never get the header), you can change the name instead. There's a... full post»
I made a backup script that is working pretty nicely for me so far. It has several functions, the last one being the essence of it. Make a dump of all databases (postgres & mysql) to zipped sql files. It deletes any that haven't changed since the last backup. Bring a list of git repositories up-to-date with their remotes on the current branch, by cloning or pulling. (This is useful to have a backup of a non-bare repository). Make... full post»
Logging the output of a shell script is easy enough: ./script.sh 1> log.out 2> log.err But if the output is long and you find an error, you might want to know which output corresponds to it. Logging errors to one file, and output+errors to another is a little more tricky. And there are other potentially useful features: Truncate the log file if it gets too long. Keep the last part, and ideally whole lines. Add timestamps to each line... full post»
As a little addition to the concentration music post, I'd like to mention the situation where there are people talking around you and you can't help listening to what they say and it's very distracting. Instead of turning up your music really, really loud and ruining your ears, I find that you can tune out individual voices by simply adding more. This is well outside the realm of "music", but it can be useful. I put it on Youtube several... full post»
Natively, JSON files do not support comments. Douglas Crockford removed them early on to prevent parsing directives, and the JSON standard is frozen now so they'll never return. Douglas Crockford is, of course, a pretty smart guy, so I'm not going to argue with that decision. But like many other people, I want comments anyway. There's a package commentjson, but unfortunately it does not support Python 3. There are several other JSON tricks I like, so I made my own package... full post»
The markv.nl blog! #tools, #china, #ubuntu, #data-science, #effecive-altruism, #useless, #example, #bash, #functional-programming, #rust, #shell, #fortran