Writing full applications in x64 assembly: dclock Part 1

Most people will say it’s pointless, or that you’re crazy, to write an entire application in assembly. Many even avoid gcc intrinsics or inline assembly because compilers can create “better, faster” code than assembly written by hand. While that may hold some truth, why do we write everything in assembly? Because it’s fun! It’s challenging, and we love low-level stuff. You will also learn how some of the higher-level code works and how to implement the functions we use every day.

I will go through my application dclock and talk about its development, the pitfalls, and all the annoying problems I ran into. So what is it? It breaks your day up into 1000 decimal minutes and shows how many you have left before you get another 1000. I first saw this idea when I purchased a “watchy”, a DIY watch, and saw that a user had created a watch face called the calculateur. I created one for the sensor watch afterwards, and now I have decided to write one for the Linux command line.
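To make the countdown concrete, here is a minimal sketch of the arithmetic, following the calculateur description where midnight is 1000 and noon is 500. It is not the code from dclock.s: the helper name decimal_remaining, the SEC_PER_DAY and DECIMAL_PER_DAY equates, and the choice to work in whole seconds (dclock itself works in nanoseconds) are all assumptions made for illustration.

```asm
# decimal.s - hypothetical sketch of the countdown arithmetic (not from dclock.s)
.set SEC_PER_DAY, 86400
.set DECIMAL_PER_DAY, 1000
.set __NR_exit, 60

.section .text
.globl _start

# decimal_remaining (hypothetical helper)
#   in:  %rdi = seconds elapsed since local midnight (0..86399)
#   out: %rax = decimal minutes left in the day (0..1000)
decimal_remaining:
    mov     $SEC_PER_DAY, %rax
    sub     %rdi, %rax                      # rax = seconds left today
    imul    $DECIMAL_PER_DAY, %rax, %rax    # scale first so the division keeps precision
    xor     %edx, %edx                      # clear the high half of the rdx:rax dividend
    mov     $SEC_PER_DAY, %rcx
    div     %rcx                            # rax = quotient, rdx = remainder (ignored)
    ret

_start:
    mov     $57600, %rdi                    # 16:00:00 as seconds since midnight
    call    decimal_remaining               # rax = 333

    cmp     $333, %rax                      # self-check: exit 0 if we got 333
    setne   %dil
    movzbl  %dil, %edi
    mov     $__NR_exit, %eax
    syscall
```

Assembled and linked with as --64 and ld, the program exits with status 0 when the result for 16:00 is 333. Multiplying before dividing is the important design choice here, since integer division would otherwise throw the fraction away too early.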

When writing assembly programs, it is good to keep snippets of reusable code saved so you are not constantly rewriting the simple functions that support the overall program. In dclock we will use print_str, uint64_to_ascii and strn_cmp, which I constantly reuse in my assembly programs.

Requirements/Planning

So what do we need to make this program work? We need a syscall that gets us the current system time. We use sys_clock_gettime, which fills in a timespec struct holding seconds and nanoseconds. I chose to work in nanoseconds to get the most precise decimal time we could. Here is where the hard part came in: normally in C we could use localtime() to convert the time since the epoch to our local timezone. But we don’t link with libc, right? So I had to create the functions we need to get to our local timezone. This took A LOT of looking around and debugging.
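As a rough illustration of the syscall, here is a minimal standalone sketch that asks the kernel for CLOCK_REALTIME and loads both fields of the result. The buffer name ts, the local labels and the exit handling are assumptions for this example and are not taken from dclock.s.

```asm
# gettime.s - hypothetical sketch of calling clock_gettime without libc
.set CLOCK_REALTIME, 0
.set __NR_clock_gettime, 228
.set __NR_exit, 60

.section .bss
ts:
    .skip 16                          # struct timespec: tv_sec (8 bytes), tv_nsec (8 bytes)

.section .text
.globl _start
_start:
    mov     $__NR_clock_gettime, %eax
    mov     $CLOCK_REALTIME, %edi     # which clock to read
    lea     ts(%rip), %rsi            # where the kernel writes the timespec
    syscall                           # rax = 0 on success, negative errno on failure

    test    %rax, %rax
    js      .Lfail

    mov     ts(%rip), %r8             # seconds since the epoch (UTC)
    mov     ts+8(%rip), %r9           # nanoseconds within the current second

    xor     %edi, %edi                # exit status 0
    jmp     .Lexit
.Lfail:
    mov     $1, %edi                  # exit status 1
.Lexit:
    mov     $__NR_exit, %eax
    syscall
```

Note that the seconds value is in UTC, which is why the timezone offset has to be applied by hand before any time-of-day math.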

In the code you will see the functions is_leap_year, get_day_of_month, get_hms and lastly get_year_from_epoch_sec. After these, in our “main” function (_start), we parse program arguments (if any) and essentially call the functions in order to get us to the point where we can calculate. Wait a minute, you say: the timespec struct already gives us the time since the epoch, so why can’t we calculate directly from that? Well… we could, but we are going to print an “expanded” format that shows the date, so we will need to convert anyway. This is why later on I decided to check command line arguments for “-v”, which shows version information, and we will eventually check for “-e” for the expanded form.
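To give a feel for what these libc-free helpers involve, here is a sketch of a Gregorian leap-year test. It is not the is_leap_year from dclock.s: the name is_leap_year_sketch, the register contract and the small self-check in _start are assumptions for illustration.

```asm
# leap.s - hypothetical sketch of a leap-year check (not the dclock.s version)
.set __NR_exit, 60

.section .text
.globl _start

# is_leap_year_sketch
#   in:  %rdi = year (for example 2024)
#   out: %rax = 1 if leap year, 0 otherwise
#   clobbers: %rcx, %rdx
is_leap_year_sketch:
    xor     %eax, %eax                # default result: not a leap year
    test    $3, %rdi
    jnz     .Ldone                    # not divisible by 4
    mov     %rdi, %rax
    xor     %edx, %edx
    mov     $100, %rcx
    div     %rcx                      # rdx = year % 100
    test    %rdx, %rdx
    jnz     .Lleap                    # divisible by 4 but not by 100
    mov     %rdi, %rax
    xor     %edx, %edx
    mov     $400, %rcx
    div     %rcx                      # rdx = year % 400
    test    %rdx, %rdx
    jnz     .Lnot                     # century year not divisible by 400
.Lleap:
    mov     $1, %eax
    ret
.Lnot:
    xor     %eax, %eax
.Ldone:
    ret

_start:
    xor     %ebx, %ebx                # rbx counts wrong answers
    mov     $2024, %rdi
    call    is_leap_year_sketch
    cmp     $1, %rax                  # 2024 is a leap year
    je      1f
    inc     %ebx
1:
    mov     $1900, %rdi
    call    is_leap_year_sketch
    test    %rax, %rax                # 1900 is not a leap year
    jz      2f
    inc     %ebx
2:
    mov     $2000, %rdi
    call    is_leap_year_sketch
    cmp     $1, %rax                  # 2000 is a leap year
    je      3f
    inc     %ebx
3:
    mov     %ebx, %edi                # exit status = number of failed checks
    mov     $__NR_exit, %eax
    syscall
```

The program exits with status 0 when all three checks pass, which makes it easy to verify from the shell with echo $? after running it.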

Lastly, after it was completed, I decided to use a config file to store the timezone offset instead of hardcoding it into the program.

Code Organization

Now, from the very beginning it is good to understand how we want to organize the code. Assembly programs get large really quickly, so having a plan for the layout will help us as the program grows. Descriptive labels and a logical layout that avoids unneeded jumps or jumping back and forth will help optimize the code and make it more understandable. The order I typically use is as follows:

1.) Comment section describing the file, usage, compiling and copyright/license information. You will notice in all my programs that I am really big on comments. I will comment as much as I can: heading comments, function comments, line comments, etc. Anything that helps people understand the code, or helps me when coming back to it, is worth writing, because assembly is not very descriptive about what you are doing. So never feel you are commenting too much. I 100% disagree that you can over-comment your code.

# -----------------------------------------------------------------------------
# dclock.s - Decimal clock that maps each day to 1000 decimal minutes.
#
# Build: run make
#   or
# as --64 -o dclock.o dclock.s
# ld -o dclock dclock.o
#
# At startup it looks in /usr/local/etc/dclock.conf for a UTC offset value.
# The default in the file is -6, and the hardcoded default is also -6.
#
# Implementation details from: https://jochen.link/experiments/calculateur.html
#
# "
# Using the number 1000 as a reference is precise enough for everyday use and 
# relatable when referring to specific parts of the day. Midnight is 1000 
# (displayed as “NEW”), noon 500, and teatime 333. Even though it is technically 
# a countdown it is not perceived like that, since checking the time usually 
# happens at a glance, not continuously. The displayed number represents all the 
# time we can still use, before we get another 1000 decimal minutes.
# "
# 
# Here it is in GNU x64 assembly form.
# Code is 100% free and public domain.
# Travis Montoya <trav@hexproof.sh>
# -----------------------------------------------------------------------------

2.) Equates. I create equates for the syscalls used in the program and for other values that are used repeatedly. This helps with code clarity: as we read through the program we see some $xx number moved into %rax, but what syscall is that? An equate answers that question. The same goes for numbers: when calculating things or reusing the same number, an equate for it helps us understand what we are calculating.

.set CLOCK_REALTIME, 0
.set STDOUT, 1

.set __NR_clock_gettime, 228
.set __NR_read, 0
.set __NR_open, 2
.set __NR_write, 1
.set __NR_close, 3

.set DEFAULT_TIMEZONE_OFFSET, -6 * 3600

.set NANO_PER_DAY, 86400000000000

3.) Next will be our data section. I know this is unusual to some, but it is how I learned growing up: I have always put the data section before the text section. You can add it at the bottom if you like, but I will explain more in depth in Part 2 why I do it this way. For example, if you look at my project elfparse, which is 2,000+ lines of assembly, the data section alone is a few hundred lines.

4.) .bss (uninitialized data section). We try to limit memory access in assembly programs because it’s slower, so we create only the few variables (3 in our case) that are needed. We try to keep track of registers and use the stack to carry information across functions.

5.) .text section (code section). This is broken up into _start, labels local to _start, program functions (so not everything is done in _start) and then helper functions: functions that are not directly related to the program but help convert, print, etc.
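Put together, the five parts might look like the skeleton below. This is only a sketch of the layout, not dclock.s itself: the file name skeleton.s, the banner string, the scratch variable and the print_buf helper are all hypothetical.

```asm
# -----------------------------------------------------------------------------
# skeleton.s - hypothetical layout sketch, not taken from dclock.s
#
# Build:
# as --64 -o skeleton.o skeleton.s
# ld -o skeleton skeleton.o
# -----------------------------------------------------------------------------

# --- Equates ---
.set STDOUT, 1
.set __NR_write, 1
.set __NR_exit, 60

# --- Data section ---
.section .data
banner:
    .ascii "skeleton\n"
    .set BANNER_LEN, . - banner       # length computed by the assembler

# --- Uninitialized data ---
.section .bss
scratch:
    .skip 8                           # placeholder variable, unused in this sketch

# --- Code: _start first, then program functions, then helpers ---
.section .text
.globl _start
_start:
    lea     banner(%rip), %rsi        # buffer to print
    mov     $BANNER_LEN, %rdx         # number of bytes
    call    print_buf

    mov     $__NR_exit, %eax
    xor     %edi, %edi                # exit status 0
    syscall

# --- Helper functions ---
# print_buf: write %rdx bytes starting at %rsi to stdout
#   clobbers: %rax, %rdi, %rcx, %r11
print_buf:
    mov     $__NR_write, %eax
    mov     $STDOUT, %edi
    syscall
    ret
```

Even in a file this small, the equates make the two syscalls readable at a glance, which is the whole point of step 2.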

Build system

Assembly? Build system? What? Creating a Makefile from the beginning makes building, debugging and cleaning the project much easier than typing as and ld every time. If you plan on releasing the program, it also makes it easier for others to build. And if the program is meant to be installed, the Makefile is an easy place to define where it should be installed.

Now when we want to build, let’s say, dclock, all we have to do is:

$ make
as --64 dclock.s -o dclock.o
ld -o dclock dclock.o
$ ./dclock
Decimal time: 334
$

Simple. Clean. Easy.

What’s next?

Now that we have discussed the plan, the code organization and how we are going to build it, in Part 2 we will start dissecting the program and talk about the development, the issues, and how I worked through some of them in the debugger.

GitHub link

You can clone this repository so you can follow along as we debug it and go through the code.

$ git clone https://github.com/Hexproofsh/dclock

Hexproof.sh Research