August 2021 – Page 3

Measuring Execution Time of Java Methods

There are two ways to measure the execution time of Java methods: wall clock time and CPU time.

Wall Clock Time Vs. CPU Time

Wall Clock Time: The time you actually wait for the method to complete. This metric is useful for measuring how fast a user perceives your method to be. For desktop and mobile apps, the measurement is rarely representative, because we don’t know what other programs the user is running alongside our app. It is very useful for server applications, though, because they run on our own servers, where we control the hardware and the load from other programs. Wall clock time is what we measure in typical performance tests of web applications.
CPU Time: The time the CPU spends executing the method. Since the CPU is shared by all processes running on a computer, this metric shows how much of that resource your program is using – in other words, for how long your program kept a CPU core busy. In an ideal world, this would be exactly the same every time you run the same code, but our world is less than ideal – both the JVM and the OS execute code in quite complicated ways to optimise system usage. So CPU time can still differ between executions, but it varies much less than wall clock time.

Source Code

You can get the source code for this article on GitHub – https://github.com/heppydepe/measuring-execution-time-of-java-methods/tree/v1.0

The Sample Class

We will use the following RandomNumbers class and measure its execution time. This is a neat little class for demonstrating time measurement – we can test object creation because the constructor generates 100 random numbers, we can test the IO operation that happens when we call writeToFile(), and we can test string conversion through the toString() method.

package com.codecreek.methodbenchmark;

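import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class RandomNumbers {
    private final List<Integer> list;

    // Object creation: fills the list with 100 random integers.
    public RandomNumbers() {
        Random rand = new Random();
        list = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            list.add(rand.nextInt());
        }
    }

    // IO operation: writes the numbers to output.txt.
    // FileWriter can throw IOException, so the method has to declare it.
    public void writeToFile() throws IOException {
        try (FileWriter fileWriter = new FileWriter("output.txt")) {
            fileWriter.write(this.toString());
        }
    }

    // String conversion: delegates to the list's own toString().
    @Override
    public String toString() {
        return list.toString();
    }
}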

Measuring Wall Clock Time

Wall Clock Time can be measured using the System.nanoTime() method. Capture the timer value when the execution starts, capture it again when the execution ends, and subtract the first from the second to get the elapsed time in nanoseconds. There are a dozen ways of capturing system time, but I like the simple nanoTime(). For some other ways, you can refer to this StackOverflow answer.

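// iterations, skipFirst, measureList and randomNumbers are set up elsewhere in the full source.
try {
    for (int i = 0; i < iterations; i++) {
        startTime = System.nanoTime();
        randomNumbers.writeToFile();
        endTime = System.nanoTime();
        // Ignore the first skipFirst iterations; record the elapsed nanoseconds of the rest.
        if (i >= skipFirst) {
            measureList.add(endTime - startTime);
        }
    }
} catch (IOException ex) {
    ex.printStackTrace();
}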

System.out.println("Wall Clock Time:\t" + measureList);
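The snippet above relies on variables from the full project. As a rough, self-contained sketch of the same idea, the class below times any Runnable with System.nanoTime() and drops the first few runs. The class name WallClockSketch, the measure() helper and the string-building workload are my own placeholders, not part of the original source.

```java
import java.util.ArrayList;
import java.util.List;

class WallClockSketch {

    /** Runs the task repeatedly and returns the elapsed nanoseconds of each run after the skipped ones. */
    static List<Long> measure(Runnable task, int iterations, int skipFirst) {
        List<Long> measureList = new ArrayList<>();
        for (int i = 0; i < iterations; i++) {
            long startTime = System.nanoTime();
            task.run();
            long endTime = System.nanoTime();
            // The earliest runs tend to be slower, so they are not recorded.
            if (i >= skipFirst) {
                measureList.add(endTime - startTime);
            }
        }
        return measureList;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for randomNumbers.writeToFile().
        Runnable task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100; i++) {
                sb.append(i).append(',');
            }
            if (sb.length() == 0) {
                throw new IllegalStateException("unreachable, keeps the work observable");
            }
        };
        List<Long> times = measure(task, 1_000, 100);
        System.out.println("Recorded runs: " + times.size());
    }
}
```

The absolute numbers depend entirely on the machine, so only the number of recorded runs is predictable here.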

Measuring CPU Time

CPU Time can be measured using ThreadMXBean. To get an instance, call ManagementFactory.getThreadMXBean(); its getCurrentThreadCpuTime() method returns the CPU time consumed so far by the current thread, in nanoseconds. Again, there are other ways of doing this, but for this article I’m keeping to the simplest ones. You don’t need external dependencies for ThreadMXBean.

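// iterations, skipFirst, measureList and randomNumbers are set up elsewhere in the full source.
ThreadMXBean tmb = ManagementFactory.getThreadMXBean();
try {
    for (int i = 0; i < iterations; i++) {
        startTime = tmb.getCurrentThreadCpuTime();
        randomNumbers.writeToFile();
        endTime = tmb.getCurrentThreadCpuTime();
        // Ignore the first skipFirst iterations; record the CPU nanoseconds of the rest.
        if (i >= skipFirst) {
            measureList.add(endTime - startTime);
        }
    }
} catch (IOException ex) {
    ex.printStackTrace();
}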

System.out.println("CPU Time:\t" + measureList);
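To see the difference between the two clocks, here is a small self-contained sketch that measures both around the same task. It assumes the JVM supports per-thread CPU time, which it checks through isCurrentThreadCpuTimeSupported(). The class name CpuTimeSketch and the sleeping workload are hypothetical; I chose sleeping because it costs wall clock time but hardly any CPU time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class CpuTimeSketch {

    /** Returns {wallNanos, cpuNanos} for one run of the task on the current thread. */
    static long[] wallAndCpuNanos(Runnable task) {
        ThreadMXBean tmb = ManagementFactory.getThreadMXBean();
        if (!tmb.isCurrentThreadCpuTimeSupported()) {
            throw new UnsupportedOperationException("This JVM cannot measure thread CPU time");
        }
        long wallStart = System.nanoTime();
        long cpuStart = tmb.getCurrentThreadCpuTime();
        task.run();
        long cpuEnd = tmb.getCurrentThreadCpuTime();
        long wallEnd = System.nanoTime();
        return new long[] {wallEnd - wallStart, cpuEnd - cpuStart};
    }

    /** Hypothetical workload: the thread waits without using the CPU. */
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long[] t = wallAndCpuNanos(() -> sleepQuietly(200));
        System.out.println("Wall clock ns: " + t[0]);
        System.out.println("CPU ns:        " + t[1]);
    }
}
```

On a typical machine the wall clock figure comes out near the sleep duration while the CPU figure stays far below it, which is exactly the gap described at the top of this article.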

Output

I ran the code on my computer and this is what I got –

Wall Clock Time:
        Stats{count=10000, mean=278124.58609999955, populationStandardDeviation=368752.52698052896, min=111579.0, max=1.258139E7}

CPU Time:
        Stats{count=10000, mean=210671.20000000033, populationStandardDeviation=86977.8137835161, min=110000.0, max=1794000.0}

I used the Stats class from the Google Guava library. All values are in nanoseconds. Even in this relatively naive test, the wall clock time shows a much greater spread than the CPU time: its standard deviation is about four times larger.
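If you would rather not pull in Guava just for this, the same summary figures are easy to compute by hand. The sketch below is my own plain-JDK stand-in, not Guava’s implementation; SimpleStats and its fields are hypothetical names that mirror the labels in the output above.

```java
import java.util.Arrays;
import java.util.List;

class SimpleStats {
    final long count;
    final double mean;
    final double populationStandardDeviation;
    final double min;
    final double max;

    private SimpleStats(long count, double mean, double sd, double min, double max) {
        this.count = count;
        this.mean = mean;
        this.populationStandardDeviation = sd;
        this.min = min;
        this.max = max;
    }

    /** Two-pass computation: first the mean, then the squared deviations from it. */
    static SimpleStats of(List<Long> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("no measurements");
        }
        double sum = 0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (long v : values) {
            sum += v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double mean = sum / values.size();
        double squaredDiffs = 0;
        for (long v : values) {
            squaredDiffs += (v - mean) * (v - mean);
        }
        // Population standard deviation divides by n, not n - 1.
        double sd = Math.sqrt(squaredDiffs / values.size());
        return new SimpleStats(values.size(), mean, sd, min, max);
    }

    @Override
    public String toString() {
        return "SimpleStats{count=" + count + ", mean=" + mean
                + ", populationStandardDeviation=" + populationStandardDeviation
                + ", min=" + min + ", max=" + max + "}";
    }

    public static void main(String[] args) {
        System.out.println(SimpleStats.of(Arrays.asList(2L, 4L, 4L, 4L, 5L, 5L, 7L, 9L)));
    }
}
```

For the eight sample values in main(), the mean is 5 and the population standard deviation is 2.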

But I do have to reiterate: this test is good enough for my use case, but it might not be for others, because I haven’t considered what optimisations my JVM and my OS may have done while executing the code. This is more of a ‘getting-started’ kind of article. For more serious benchmarking, it’s better to use tools like JMH or Google Caliper.

Simple Command Line Java Project (using Gradle)

I like to create a simple command line application whenever I want to learn a new concept or product, or when I’m trying to implement some complicated logic. I never actually work on command line projects – for several years now I’ve been exclusively working on web applications. But my go-to approach for implementing something new is still to try it out in a command line program first and then, when I am comfortable, write it into my web application. Isolating new attempts like this helps me keep a clear mind and learn the new concept in more depth. It’s also much more efficient – otherwise, I have to load up the entire web application just to see whether my new approach worked.

Why Not Just `javac` and `java`

Yes, it’s quite simple to just write a *.java file and compile it with javac and run it with the java command. But specifying and managing dependencies is a headache. Using a tool like Gradle or Maven would enable us to avoid that whole activity and focus on our project. So even for a one-file command line application, I will use something like Gradle.

Setup Gradle

If you have never initialised a Gradle project, chances are that you never had to install Gradle on your computer. The gradlew command that is available in Gradle projects takes care of downloading and installing the required version of Gradle for that project. But now you are going to initialise a new project, and for that you need a standard installation of Gradle.

My preferred way to install Gradle is to use SDKMAN! Just run sdk install gradle and you have a clean installation of Gradle. After that, run gradle -v to check the installed version. For this tutorial, the exact version doesn’t matter much; any relatively new version is fine.

Create A New Project

For this example, let’s create a project called test-project. Steps to create a new project are –

Create a directory – Open a terminal and type mkdir test-project and then switch into the new directory cd test-project
Initialize a Gradle project – execute gradle init and then answer the questions that the script asks. For this project the answers would be –
1. Type of project is ‘application’. ‘basic’ is basically an empty project – we don’t want that. Choose ‘application’
2. Implementation language is ‘Java’
3. Split into sub projects – ‘no’. We just need a simple project.
4. Build script DSL – ‘Groovy’. Selecting Kotlin doesn’t make much of a difference anyway. Being statically typed, Kotlin makes it easier for IDEs to do fancy things like auto-complete. But Groovy is quite fine for me for this project.
5. Test framework – ‘JUnit Jupiter’. I like JUnit 5 (also called JUnit Jupiter)
6. Project name – Accept the default – which is the directory name test-project. You can give a different name if you like.
7. Source package – If the project is never going to leave your computer, simply accept the default or type whatever you like. If you’re going to share it, it’s a good idea to put some unique string here. Real projects use their domain names to make package names unique – org.apache.commons etc.

Keep reserved keywords in mind when you name things. Source package names like java.test or try.newtech will fail, because try is a reserved keyword in Java and java itself is not allowed as the first part of a package name.
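If you want to check a package name before typing it into gradle init, a few lines of Java will do. This is only a sketch under my own assumptions: the helper name isUsablePackageName is made up, and the check for the java prefix is a simplification of the rule above. SourceVersion.isName() comes with the JDK and rejects names that contain keywords.

```java
import javax.lang.model.SourceVersion;

class PackageNameCheck {

    /** Returns true if the name is a valid qualified name and does not start with the reserved java prefix. */
    static boolean isUsablePackageName(String name) {
        if (name.equals("java") || name.startsWith("java.")) {
            return false; // reserved for the platform's own packages
        }
        // isName() is false for malformed names and for keywords such as try.
        return SourceVersion.isName(name);
    }

    public static void main(String[] args) {
        String[] candidates = {"com.codecreek.methodbenchmark", "try.newtech", "java.test"};
        for (String candidate : candidates) {
            System.out.println(candidate + " usable? " + isUsablePackageName(candidate));
        }
    }
}
```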

That’s it. You’ve made a neat little application for yourself to play with.

Run the Project

Simply execute ./gradlew run to run your new little application. As of today, the default project prints out the string Hello World!

You can open this folder in your favourite text editor or IDE and go about making changes. The source code would be inside the app/src/main directory.

Execute ./gradlew tasks to see what tasks are available for you to run. If, like me, you are doing this to try out an implementation of some new algorithm or library, ./gradlew run is all you need.

Get Cracking

Open the build.gradle file in your project folder to add dependencies. Use the app/src/main/java/{{source-package}}/App.java file as a starting point for your code.

Of course this is only one of the countless ways in which you could put together a stub project for trying out new things, but it’s a neat and simple way. And since it’s Gradle, pretty much all of your dependencies are easy to add into your project. And you don’t have to suffer with a bulky big project while you are trying to learn your new technology or implement your new idea.

Thoughts on Performance Analysis

The application that I’m working on, a REST API, has a full-fledged performance test suite that’s run for every release. That test suite is owned by a separate set of people responsible for performance testing. As a developer, I limit myself to taking inputs from the performance tests and tuning the application wherever necessary. But I think there are some basics every performance test project should get down.

It is common knowledge that a basic performance test should measure both requests/second and response times. So the basic output would be a chart –

[Figure: Simple, but probably useless performance report]

The Problem

The commitment to your customer might be something like “less than 500 ms response times for load up to 500 requests/second”. The performance test will generate a chart like the one above, and you can use it to show that the response time stays under 500 ms, as committed, for up to 500 req/s. The chart also shows that there is a disproportionate performance decrease after a certain req/s, after which it gets exponentially worse. You can decide whether or not to address this based on your budget and on whether your users can be expected to increase their workload beyond the problem point.

Now my problem is: how do I define ‘requests/second’? In almost all applications, there are vastly different types of requests. Just executing one type of request repeatedly and using it to derive a performance figure would be absolutely useless. Measuring all the different types of requests and then aggregating them by average or median is also quite useless. In fact, what this translates to is that the “req/s” attribute itself is useless.

Define KPIs

You should define KPIs based on the functionality of the application. Instead of “req/s”, you have to create meaningful attributes like “user logins per second”, “orders created per second” or “get-product-by-id requests per second” etc. You need to define metrics also – almost always this is ‘response time’ because that’s the one metric that affects the users directly. But your application can have other metrics also, like volume of logs produced, number of network calls made, amount of RAM utilised. Many of these are not even measured nowadays because it has become common to drastically oversize the hardware required to run software. But still, maybe some are useful.

I believe the above point is one of the most ignored things in software development. You need to have clearly defined KPIs before you do your performance tests. Otherwise the performance testers will put whatever they feel like into the report, and you’ll end up with a vague and crowded document as a performance report.
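To make the idea of per-functionality KPIs concrete, here is a minimal sketch that keeps response times separate for each operation and reports a percentile for each one, instead of lumping everything into a single “req/s” figure. The class KpiSketch, its methods, the operation names and the numbers are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class KpiSketch {
    // Response times in milliseconds, grouped by business operation.
    private final Map<String, List<Long>> samplesByOperation = new TreeMap<>();

    void addSample(String operation, long responseMillis) {
        samplesByOperation.computeIfAbsent(operation, k -> new ArrayList<>()).add(responseMillis);
    }

    /** Nearest-rank percentile: the smallest value with at least p percent of samples at or below it. */
    static long percentile(List<Long> values, double p) {
        List<Long> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1);
    }

    /** One figure per operation, so a slow operation cannot hide behind a fast one. */
    Map<String, Long> percentileByOperation(double p) {
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, List<Long>> entry : samplesByOperation.entrySet()) {
            result.put(entry.getKey(), percentile(entry.getValue(), p));
        }
        return result;
    }

    public static void main(String[] args) {
        KpiSketch kpis = new KpiSketch();
        for (long ms : new long[] {100, 120, 110}) {
            kpis.addSample("user-login", ms);
        }
        for (long ms : new long[] {800, 900, 1000}) {
            kpis.addSample("create-order", ms);
        }
        System.out.println("Median per operation: " + kpis.percentileByOperation(50));
    }
}
```

With these made-up samples, the median is 110 ms for user-login and 900 ms for create-order. A single median over all six values would describe neither of them.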

Create an Environment

Second, you have to get the environment right. The infrastructure you use for performance testing should ideally be the same size as the one you use for production. The data already present on the system should also be the same size as in production. For example, the number of products in the database will most likely affect how fast one product can be retrieved. So you need a ‘seeding’ process in which you populate your performance test stack with production-like data.

It helps to create a ‘seeding program’ that creates test data based on a configuration file. The config file might have settings like the number of users to create, the number of products to create, the number of orders the users might already have, etc. These settings can also be ‘ranges’ – e.g., each user has 0 to 30 orders, distributed randomly. This might seem like overkill, but trust me, it will up your performance testing game to epic levels. It will give you the confidence to say you can performance test any given scenario and, consequently, future-proof your application.
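A tiny sketch of such a seeding step could look like the following. The property keys (users.count, orders.perUser.min, orders.perUser.max, random.seed) and the class name SeedingSketch are assumptions of mine, and a real seeding program would write to the database rather than return an array. The fixed seed is there so that two seeding runs produce the same data set.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.Properties;
import java.util.Random;

class SeedingSketch {

    /** Decides how many orders each user gets, drawn uniformly from the configured range. */
    static int[] ordersPerUser(Properties config) {
        int users = Integer.parseInt(config.getProperty("users.count"));
        int min = Integer.parseInt(config.getProperty("orders.perUser.min"));
        int max = Integer.parseInt(config.getProperty("orders.perUser.max"));
        long seed = Long.parseLong(config.getProperty("random.seed", "42"));

        Random random = new Random(seed); // same seed, same data set
        int[] orders = new int[users];
        for (int i = 0; i < users; i++) {
            orders[i] = min + random.nextInt(max - min + 1); // both ends inclusive
        }
        return orders;
    }

    public static void main(String[] args) throws IOException {
        // In a real project this text would live in a config file.
        String configText = "users.count=5\n"
                + "orders.perUser.min=0\n"
                + "orders.perUser.max=30\n";
        Properties config = new Properties();
        config.load(new StringReader(configText));
        System.out.println(Arrays.toString(ordersPerUser(config)));
    }
}
```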

Report to the Business

Make your report for the business to understand. Don’t make it a technical report like “response time vs. req/s”. Your report should tell stories, like “If the number of logged-in users increases by 50%, what is the response time for order registration?” If you started your process by clearly defining KPIs, your report will come out great naturally. It will communicate directly what needs to be done for the future and what risks there are in the system.

“All requests must respond within a certain response time; if they don’t, just spend more money and throw more hardware at it” seems to have become a common excuse for skipping a proper performance tune-up. Requests are not equal. One type of request might have to talk to the database. Another might have to wait for another service on the internet. It’s simply wrong, and very inconvenient, to group all requests of an application together. Don’t do that. Go one step further and save some money by doing a proper performance analysis.

403 vs 404

The most recent dilemma I faced at work was deciding between returning a 403 (Forbidden) error and a 404 (Not Found) error in an API GET call.

The scenario is this: a resource does not exist on the server, and a user has made a GET call for that resource. The application, as part of its normal execution, does an authorisation check and reports a 403 error. Now my users are getting confused – have they tried an invalid resource ID, or are they having problems with authorisation?

Thinking puristically, we had initially decided to return the errors as is. That is, a 403 would always be returned, even for resources not found on the server. The reasoning was that this benefitted security: it is better to not even let an unauthorised person know whether or not a resource exists. This was an easy decision and didn’t need much convincing of the product managers. When you say it’s for security, it almost always gets accepted.

But this time around, the decision differed. The same people who had decided on 403 now suggested 404. It turns out that alleviating the users’ confusion over whether they lacked authorisation, or had made a mistake in the resource ID, or whether the dog ate the resource, and so on, was much more important than security.

We went with returning a 404. The application first checks whether the resource exists and returns a 404 error if it does not. If the resource does exist, we check for authorisation and return a 403 error if the user does not have it.
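In code, the order of the two checks is all that matters. The sketch below is a simplified illustration and not our actual application code: the class ResourceStatusSketch, the statusForGet() method and the map of readers per resource are hypothetical stand-ins for a real data store and authorisation layer.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class ResourceStatusSketch {

    /** Decides the HTTP status for a GET: the existence check comes first, then authorisation. */
    static int statusForGet(String resourceId, String userId, Map<String, Set<String>> readersByResource) {
        Set<String> readers = readersByResource.get(resourceId);
        if (readers == null) {
            return 404; // the resource does not exist, whoever is asking
        }
        if (!readers.contains(userId)) {
            return 403; // the resource exists, but this user may not read it
        }
        return 200;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> readersByResource = new HashMap<>();
        readersByResource.put("order-1", Collections.singleton("alice"));

        System.out.println(statusForGet("order-1", "alice", readersByResource));
        System.out.println(statusForGet("order-1", "bob", readersByResource));
        System.out.println(statusForGet("order-9", "bob", readersByResource));
    }
}
```

With this data, the three calls yield 200, 403 and 404. The security-first variant we started with would have returned 403 for the last call as well.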

The lesson learnt is that purist attitudes like “correct design” or “security first”, will almost always take a back seat to user experience. Users will not choose your application because it’s architected well or it is secure. Users will choose the application that’s easy for them to use. If you forget this, you will end up with unsatisfied users, who you will not be able to convince that you made decisions for better security or best practice or whatever else.

As a software maker, it’s your responsibility to make software that is first comfortable for your user to use, and then, still ensure that it is secure. So not just for “403 or 404?”, for any such conundrums you face while creating software, remember, customer satisfaction will always beat other factors that may contribute to your decision.

To answer the question “403 or 404?” – Do what your customer will like.

Don’t Watch When You Can Read

For a few years, until a couple of months ago, I was on a video lesson subscription spree, registering for video lessons on almost every topic on earth. Drawing lessons, piano lessons, programming courses and so on. Not only those, but also several YouTube channels for entertainment – fun facts, interesting science stuff, crime reports, movie star interviews … just about everything was videos, videos, videos. The explosion of videos on YouTube and the tons of dead cheap video courses had me thinking there was so much I could learn for so little money.

As I was thinking all is well, they gave me an ebook library subscription at work. I went on to pick a couple ebooks on it and went about reading them. That’s when I realised, videos are no match for reading. I’m not sure if this is just for me or whether it applies to everyone. I find that it’s much faster to read from books than learn from videos. I also find reading more effective.

Reading is actually faster. It takes only a few minutes to read almost any article, such as this blog post. This is because we are trained to skip a lot of words and move through the text very fast. Over the years, I suppose, our brains have developed to a state where they know what is being read without having to take in the entire syntax or even all the words. We read by keywords. Also, reading happens at my speed. If the material is difficult, I read slowly; if it’s easy, I read fast. A video plays at the presenter’s speed. I can maybe do 1.5x or 2x speed on the video, but that’s not the same thing.

It’s possible to copy-paste. If I’m going through a tutorial or a lesson on an ebook or an internet article, when I want to try out some command or a code snippet, I can simply copy-paste it. Even when I want to use it as a sample and write my own code, I still copy-paste it to my text editor as a reference. Needless to say, this was out of the question when my search results sent me to a YouTube video.

Progress only if I’m there. If my mind wanders off mid way for a few minutes, when I come back I’m not even sure where I left the video at, it keeps running anyway. This is not a problem while reading. You can simply return to the spot on the page where you were before you started day dreaming.

Fewer distractions. Of course, it’s extremely difficult to be without distractions today. But still, there’s a big distance between reading and watching videos in this matter. Almost all video sites constantly badger you with ads, suggestions on what to watch next, related videos, comments and so on. There are far fewer of these when you are reading, and practically none if you are reading an ebook.

Correctness. I believe text material such as books and reference websites are inherently less error prone. Simply because making text is much simpler and hence there could be more focus on accuracy. And another more important reason is that most video makers don’t bother to edit and repost their videos in case there is a mistake. Maybe they’ll drop a comment or a note in the description – which we might not notice. But books and articles are easy to update in case an error is spotted.

It’s more entertaining. When you are consuming for entertainment, reading a story engages you much more than watching it on a video. You imagine the visuals and sounds as you read the story. It builds more connections in your brain and it is a more active task for your mind. The TV used to be called the idiot box for a reason.

There are a few areas in which videos excel – for example in presenting 3D pictures – like how are electrons positioned in an atom, in presenting art-related education – like how to play a musical instrument and so on. But these are only a few. For the most part, especially if I’m trying to learn something, I find reading is definitely more effective than watching.