thello.s: sizes of constant strings should use .equ, not loading a byte from data memory, and various other optimizations.

Someone linked https://github.com/robohack/experiments/blob/430b5ea22bc2f4f697c659aeb399e938d09744c1/thello.s for an example of a BSD build command, which is why I'm randomly looking at it.

It has one bug (in a comment): syscall definitely can't take an arg in RCX, because the syscall instruction itself overwrites RCX (with the return address) before the kernel gets control. (https://stackoverflow.com/questions/32253144/why-is-rcx-not-used-for-passing-parameters-to-system-calls-being-replaced-with)  Linux uses R10 instead of RCX, with the rest of the convention matching the function-calling convention.  I'd guess most other x86-64 SysV OSes do the same, but I don't know for sure.

```
# SYSCALL ARGS
# rdi rsi rdx r10 r8 r9
  # wrong original: # rdi rsi rdx rcx r8 r9    # that's the function-calling convention.
```

Separately from that:

```
	andq $-16, %rsp		# clear the 4 least significant bits of stack pointer to align it
```
RSP is already aligned by 16 on process entry, as guaranteed by the x86-64 System V ABI, so this `and` is unnecessary.

```
	mov $4, %rax		# SYS_write
	mov $1, %rdi
```
You can `mov $4, %eax` to do this more efficiently (implicit zero-extension to 64-bit), especially if you're later trying to optimize by merging a length into the low byte of RDX (which most kernels zero on process entry).  Also, you can `#include <sys/syscall.h>` to get call numbers as CPP macro #defines, so you can `mov $SYS_write, %eax`.  (Name your file with a `.S` suffix so gcc will run it through CPP first.)

You can use `as -O2` or `-Os` to have the assembler do simple optimizations like turning `mov $4, %rax` into `mov $4, %eax`, as NASM does, because the architectural effect is identical.  (If using GCC, that's `-Wa,-O2`, *not* `gcc -O2`.)

```
	mov $hello, %rsi
```
Using a 32-bit sign-extended immediate for an absolute address is possible but inefficient.  Normally you'd use `lea hello(%rip), %rsi`, or `mov $hello, %esi` (if 32-bit sign-extended works, so does zero-extended, assuming user-space using the bottom of the virtual address space, not the top.)  https://stackoverflow.com/questions/57212012/how-to-load-address-of-function-or-label-into-register

```
	mov $1, %rax		# SYS_exit
	xor %rdi, %rdi
```
Again, 32-bit operand-size is 100% fine, especially for the xor since `exit()` takes an int arg.   See my answer on https://stackoverflow.com/questions/33666617/what-is-the-best-way-to-set-a-register-to-zero-in-x86-assembly-xor-mov-or-and


Putting a constant byte in static storage is just silly; make it an assemble-time constant you can use as an immediate, like `mov $hello_len, %edx` (or `%rdx` if you want).

```
.section .data       # could be .section .rodata

hello:
	.ascii "Hello world!\n"
hello_len = . - hello
# .equ hello_len,  . - hello     # alternative using .equ

	#.byte . - hello
```

So

```
	mov hello_len, %dl	# Note: does not clear upper bytes. Use movzxb (move zero extend) for that
```

becomes

```
	mov $hello_len, %edx       # zero-extends to fill RDX
```


