Linux x86 read file

Display the contents of a text file in Linux 32-bit x86 assembly

I haven’t tested this (and it’s not necessarily NASM syntax), but something along these lines ought to work on an x86 Linux machine:

    ; Open file
    mov ecx,0               ; FILEMODE_R
    mov ebx,filePath
    mov edx,01FFh
    mov eax,5               ; __NR_open
    int 80h                 ; syscall
    mov fileHandle,eax

    ; Read file data
    mov ebx,fileHandle
    mov ecx,buffer
    mov edx,numBytesToRead
    mov eax,3               ; __NR_read
    int 80h

    ; Write to STDOUT
    mov edx,numCharsToWrite
    mov ecx,buffer
    mov ebx,1               ; STDOUT
    mov eax,4               ; __NR_write
    int 80h

    ; Repeat as many times as necessary

    ; Close file
    mov ebx,fileHandle
    mov eax,6               ; __NR_close
    int 80h

I deleted my answer as I misread the question. Before I did, Frank Kotler left the following comment on it (mistaking it for yours I guess): «Good answer, Michael! Only thing I would suggest is make sure the sys_open succeeds (eax > 0) before proceeding..»

@MathewHall: Right. I removed all the error checks in order to keep the code short. IRL one should of course assume that anything that can go wrong eventually will and check for errors.
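To illustrate the kind of check being discussed, here is a minimal C sketch that opens a file and reads its first bytes, testing each result before going on. It is only an illustration: `read_start` is a made-up name, and it returns a negative errno value on failure on purpose, because that is the convention the raw `int 80h` interface uses in eax (the C library wrappers return -1 and set `errno` instead).

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read up to `cap` bytes from the start of `path`.
 * Returns the number of bytes read, or a negative errno value on failure,
 * mirroring what eax holds after a failed raw system call. */
long read_start(const char *path, void *buf, size_t cap)
{
    int fd = open(path, O_RDONLY);      /* eax = 5 (__NR_open), ecx = 0 */
    if (fd < 0)                         /* check before using the handle */
        return -errno;                  /* e.g. -ENOENT for a missing file */

    ssize_t n = read(fd, buf, cap);     /* eax = 3 (__NR_read) */
    long result = (n < 0) ? -errno : (long)n;

    close(fd);                          /* eax = 6 (__NR_close) */
    return result;
}
```

In assembly the equivalent test is a signed comparison of eax against zero right after `int 80h`, before eax is stored as the file handle.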

Use this in the terminal like ./[program name] > destination.txt < source.txt, where source.txt is whatever file you want to copy from. It copies one byte at a time. If you don’t specify a destination file, the program displays the contents of your file in the terminal, i.e. ./[program name] < source.txt.

    SECTION .bss
    fileBuf: resb 1

    SECTION .data

    SECTION .text
    global _start

    _start:
        nop
    read:
        mov eax, 3          ; sys_read
        mov ebx, 0          ; standard input
        mov ecx, fileBuf
        mov edx, 1
        int 80h

        cmp eax, 0          ; stop at EOF (0) or on a read error (negative)
        jle exit
    write:
        mov eax, 4          ; sys_write
        mov ebx, 1          ; standard output
        mov ecx, fileBuf
        mov edx, 1
        int 80h
        jmp read
    exit:
        mov eax, 1          ; sys_exit, wrap it up
        mov ebx, 0
        int 80h

Read file from a specific position in x86

Is it possible to start reading a file from a specific line or byte? Currently I use this code to read 4 bytes of a file:

    section .data
        filename db "file.txt", 0

    section .bss
        read_data resb 4

    section .text
    global _start

    _start:
        mov rax, SYS_OPEN
        mov rdi, filename
        mov rsi, O_RDONLY
        mov rdx, 0
        syscall

        push rax

        mov rdi, rax
        mov rax, SYS_READ
        mov rsi, read_data
        mov rdx, 4
        syscall

        mov rax, SYS_CLOSE
        pop rdi
        syscall

This code always reads the first 4 bytes, but I want to start reading from other parts of the file, like the middle for example. What do I need to add or change?

2 Answers

A freshly-opened file descriptor starts at position = 0. If you keep reading from the same fd in a loop, you’ll get successive chunks. (Use a larger buffer like 8 KiB and loop over dwords in user-space, though, using the value that read returned as an upper limit! A system call is very expensive in CPU time.)
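As a rough C sketch of that loop (not anyone’s actual code from this thread): `sum_bytes` is a hypothetical helper that reads successive 8 KiB chunks from the same fd and only touches the first `n` bytes of the buffer, where `n` is what `read` returned. It works on bytes rather than dwords to keep the tail handling trivial.

```c
#include <unistd.h>

enum { CHUNK = 8192 };   /* 8 KiB buffer, as suggested above */

/* Hypothetical helper: read `fd` until EOF in CHUNK-sized pieces and
 * return the sum of all byte values, or -1 if a read fails. */
long long sum_bytes(int fd)
{
    unsigned char buf[CHUNK];
    long long sum = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);  /* continues where the last read stopped */
        if (n < 0)
            return -1;                          /* read error */
        if (n == 0)
            break;                              /* EOF */
        for (ssize_t i = 0; i < n; i++)         /* n, not CHUNK, is the upper limit */
            sum += buf[i];
    }
    return sum;
}
```

A 20000-byte input needs only a handful of `read` calls this way, instead of 20000 calls with a one-byte buffer.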

«Is it possible to start reading a file from a specific line or byte?»

  • Byte: yes
  • Line: no. In Unix/Linux, the kernel doesn’t have an index of line-start byte offsets or any other line-oriented API. The line handling in stdio fgets, for example, is done purely in user-space. There have been some historical OSes with record-based files, but Unix files are flat arrays of bytes. (They can have holes, unwritten extents, and extended attributes. But the kernel APIs for the main file contents only operate on byte offsets.)

If you want to do lines, read a big block and loop forward until you’ve seen some number of newlines. If you’re not there yet, read another block; repeat until you find the start and end of the line number you want, or you hit EOF. x86-64 can efficiently search 16 bytes at a time with pcmpeqb / pmovmskb / popcnt (popcnt requires SSE4.2 or the specific popcnt feature bit).

Or with just SSE2, or when optimizing for large blocks, use pcmpeqb / psadbw (against all-zero) to hsum bytes to qwords / paddd. Then, every so often, check with some scalar code how many lines you have passed. Or keep it simple and branch on finding the first newline in a SIMD vector.
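Here is a hedged C sketch of the pcmpeqb / pmovmskb / popcnt idea, written with the SSE2 intrinsics that correspond to those instructions. Assumptions: GCC or Clang (for `__builtin_popcount`, which only becomes a single popcnt instruction when the compiler is allowed to use it, e.g. with -mpopcnt), and `count_newlines` is a made-up name. When SSE2 is not available the function simply falls back to the scalar loop.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper: count '\n' bytes in buf[0 .. len). */
size_t count_newlines(const char *buf, size_t len)
{
    size_t count = 0;
    size_t i = 0;

#if defined(__SSE2__)
    const __m128i nl = _mm_set1_epi8('\n');
    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        __m128i eq = _mm_cmpeq_epi8(chunk, nl);             /* pcmpeqb */
        unsigned mask = (unsigned)_mm_movemask_epi8(eq);    /* pmovmskb */
        count += (size_t)__builtin_popcount(mask);          /* one bit per matching byte */
    }
#endif

    for (; i < len; i++)            /* scalar tail, or the whole buffer without SSE2 */
        count += (buf[i] == '\n');

    return count;
}
```

Calling this on each block that `read` returns gives a running line count, so the caller knows when the wanted line lies inside the current block.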

Obviously the slow and simple option is a byte-at-a-time loop that counts '\n' characters. If you know how to do strchr with SSE2, it should be straightforward to vectorize that search using the above suggestions.
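The slow and simple version might look like the following C sketch. `line_start_offset` is a hypothetical name; the sketch assumes the fd is positioned at the start of the file, counts lines from 1, and treats a trailing newline as starting one more (empty) line.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the byte offset at which line `line_no`
 * (1-based) starts, or -1 if the file has fewer lines or a read fails.
 * Assumes `fd` is currently at offset 0. */
off_t line_start_offset(int fd, long line_no)
{
    char buf[8192];
    off_t pos = 0;        /* file offset of buf[0] */
    long current = 1;     /* number of the line the scan is currently in */

    if (line_no < 1)
        return -1;
    if (line_no == 1)
        return 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0)
            return -1;                        /* EOF before the line, or error */
        for (ssize_t i = 0; i < n; i++) {
            if (buf[i] == '\n' && ++current == line_no)
                return pos + i + 1;           /* the line starts after this newline */
        }
        pos += n;                             /* next block starts here */
    }
}
```

The offset it returns can then be used as the starting byte position for reading that line.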

But if you only want some specific byte positions, you have two main options:

  • seek with lseek(2) before read(2) (see @Nicolae Natea’s answer)
  • Use POSIX/Linux pread(2) to read from a specified offset, without moving the fd’s file offset for future read calls. The Linux system call name is pread64 (__NR_pread64 equ 17 from asm/unistd_64.h):

        ssize_t pread(int fd, void *buf, size_t count, off_t offset);

    The only difference from read is the offset arg, the 4th arg, which is thus passed in R10 (not RCX as in the user-space function calling convention). off_t is a 64-bit type, simply passed in a single register in 64-bit code.
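Both options can be sketched in C, which maps closely onto the system calls. This is only a sketch: `read_at_lseek`, `read_at_pread` and `read_middle` are made-up names, and `read_middle` assumes a regular (seekable) file.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Option 1: lseek(2), then read(2). This moves the fd's file offset. */
ssize_t read_at_lseek(int fd, void *buf, size_t count, off_t offset)
{
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, count);
}

/* Option 2: pread(2). The fd's file offset is left untouched. */
ssize_t read_at_pread(int fd, void *buf, size_t count, off_t offset)
{
    return pread(fd, buf, count, offset);
}

/* Hypothetical usage: read 4 bytes starting at the middle of the file. */
ssize_t read_middle(const char *path, char out[4])
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t size = lseek(fd, 0, SEEK_END);      /* seeking to the end yields the size */
    ssize_t n = (size < 0) ? -1 : read_at_pread(fd, out, 4, size / 2);

    close(fd);
    return n;
}
```

After `read_at_pread` a later plain `read` continues from wherever the descriptor already was; after `read_at_lseek` it continues just past the bytes that were read.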

Other than the pread64 name in the .h, there’s nothing special about the asm interface compared to the C interface; it follows the standard system-calling convention. (It has existed since Linux 2.1.60; before that, glibc’s wrapper emulated it with lseek.)
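To see how thin the layer between the two interfaces is, here is a hedged sketch that skips the `pread` wrapper and goes through glibc’s generic `syscall()` function with `SYS_pread64`. It assumes a 64-bit Linux target (on 32-bit targets a 64-bit offset does not fit in one register, and this sketch does not handle that). `raw_pread` and `read4_at` are hypothetical names.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/syscall.h>   /* SYS_pread64 */
#include <sys/types.h>
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: call pread64 by number.
 * On x86-64 the arguments end up as rax = SYS_pread64, rdi = fd,
 * rsi = buf, rdx = count, r10 = offset.
 * Like any glibc wrapper, syscall() returns -1 and sets errno on failure. */
ssize_t raw_pread(int fd, void *buf, size_t count, off_t offset)
{
    return (ssize_t)syscall(SYS_pread64, fd, buf, count, offset);
}

/* Hypothetical usage: read up to 4 bytes at `offset` from `path`. */
ssize_t read4_at(const char *path, char out[4], off_t offset)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = raw_pread(fd, out, 4, offset);
    close(fd);
    return n;
}
```

A hand-written assembly version loads the same values into the same registers and executes `syscall` itself.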

There are other things you can do, like mmap or a preadv system call, but pread is most exactly what you’re looking for if you have a known position you want to read from.
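For comparison, the mmap route could look roughly like this in C. It is a sketch under assumptions: `mmap_read_at` is a made-up name, it maps the whole file read-only (fine for small files, wasteful for huge ones), and it expects a regular, non-empty file.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: copy up to `count` bytes starting at `offset`
 * out of the file at `path`. Returns the bytes copied, or -1 on failure
 * or when `offset` lies outside the file. */
ssize_t mmap_read_at(const char *path, void *out, size_t count, off_t offset)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || offset < 0 || offset >= st.st_size) {
        close(fd);
        return -1;
    }

    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                          /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    size_t avail = (size_t)(st.st_size - offset);
    if (count > avail)
        count = avail;                  /* do not copy past the end of the file */
    memcpy(out, (const char *)map + offset, count);

    munmap(map, (size_t)st.st_size);
    return (ssize_t)count;
}
```

Once the file is mapped, "reading from a position" is just pointer arithmetic, which is why no seek or offset argument appears after the `mmap` call.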

NASM, read a file and print the content

I have this piece of code in NASM (for Linux) that is supposed to open an existing file, read it and print the content on the screen, but it does not work. Can someone tell me what I am doing wrong? (hello.txt is the name of the file.)

    section .data
        file db "./hello.txt", 0
        len equ 1024

    section .bss
        buffer: resb 1024

    section .text
    global _start

    _start:
        mov ebx, [file]     ; name of the file
        mov eax, 5
        mov ecx, 0
        int 80h

        mov eax, 3
        mov ebx, eax
        mov ecx, buffer
        mov edx, len
        int 80h

        mov eax, 4
        mov ebx, 1
        mov ecx, buffer
        mov edx, len
        int 80h

        mov eax, 6
        int 80h

        mov eax, 1
        mov ebx, 0
        int 80h

2 Answers

    mov ebx, [file]     ; name of the file
    mov eax, 5
    mov ecx, 0
    int 80h

This is wrong. Lose the square brackets around file. With the brackets you are passing the first bytes of the file name itself instead of a pointer to the file name.

    mov ebx, file       ; const char *filename
    mov eax, 5
    mov ecx, 0
    int 80h
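To make the difference concrete, here is a small C sketch of what each form puts into ebx. All names are made up for illustration; `dword_at` mimics a 32-bit little-endian load, which is what `mov ebx, [file]` performs on x86.

```c
#include <stdint.h>

static const char file[] = "./hello.txt";

/* Hypothetical helper: the 32-bit little-endian value stored at p. */
uint32_t dword_at(const char *p)
{
    const unsigned char *b = (const unsigned char *)p;
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}

/* mov ebx, [file]: ebx receives the CONTENTS, i.e. the bytes '.', '/', 'h', 'e'. */
uint32_t ebx_with_brackets(void)
{
    return dword_at(file);
}

/* mov ebx, file: ebx receives the ADDRESS of the string, which open needs. */
const char *ebx_without_brackets(void)
{
    return file;
}
```

For "./hello.txt" the bracketed form yields 0x65682F2E. The kernel would treat that number as an address, and the open call will most likely fail because no readable file name is mapped there.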

I see a lot of mistakes here. In order:

    mov ebx, [file]     ; name of the file
    mov eax, 5
    mov ecx, 0
    int 80h

Here, as already said, you must lose the square brackets, because the function needs a pointer, not a value.

    mov eax, 3
    mov ebx, eax
    mov ecx, buffer
    mov edx, len
    int 80h

Here you must save the file descriptor from eax before writing the value 3 there; otherwise you just lose it.

    mov eax, 4
    mov ebx, 1
    mov ecx, buffer
    mov edx, len
    int 80h

Here you overwrite the ebx register, so it is better to save the file descriptor in memory. And for the display you write 1024 bytes from the buffer, which is not correct. After reading from the file, the eax register contains the number of bytes read, so after the read it is better to copy the value from eax into edx.

Again: you close the file, but ebx contains garbage, although it must contain the file descriptor.
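The same three points (keep the descriptor, write only as many bytes as were read, close the descriptor that open returned) can be sketched in C. This is an illustration, not the poster’s code: `print_once` is a hypothetical name, and like the assembly it does a single read of at most 1024 bytes.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: one open, one read, one write, one close.
 * Returns the number of bytes written to out_fd, or -1 on failure. */
ssize_t print_once(const char *path, int out_fd)
{
    char buffer[1024];

    int descriptor = open(path, O_RDONLY);   /* keep the fd in its own variable */
    if (descriptor < 0)
        return -1;

    ssize_t count = read(descriptor, buffer, sizeof buffer);
    ssize_t written = -1;
    if (count >= 0)
        written = write(out_fd, buffer, (size_t)count);   /* count, not 1024 */

    close(descriptor);                       /* the fd from open, not a leftover value */
    return written;
}
```

Because each value lives in its own variable, reusing a register for the next call cannot clobber the descriptor, which is what storing it in memory achieves in the assembly version.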

The corrected code should look like this:

    section .data
        file db "text.txt",0        ; filename ends with a '\0' byte

    section .bss
        descriptor resb 4           ; memory for storing the descriptor
        buffer resb 1024
        len equ 1024

    section .text
    global _start

    _start:
        mov eax,5                   ; open
        mov ebx,file                ; filename
        mov ecx,0                   ; read only
        int 80h                     ; open the file read-only

        mov [descriptor],eax        ; store the descriptor

        mov eax,3                   ; read from file
        mov ebx,[descriptor]        ; your file descriptor
        mov ecx,buffer              ; read into buffer
        mov edx,len                 ; read len bytes
        int 80h                     ; read len bytes from the file into buffer

        mov edx,eax                 ; number of bytes read becomes the write count
        mov eax,4                   ; write
        mov ebx,1                   ; stdout (terminal)
        mov ecx,buffer              ; from buffer
        int 80h                     ; write all bytes read to the terminal

        mov eax,6                   ; close file
        mov ebx,[descriptor]        ; your file descriptor
        int 80h                     ; close your file

        mov eax,1                   ; exit
        mov ebx,0
        int 80h

This is not perfect code, but it should work.

Reading from a file in assembly

I’m trying to learn assembly — x86 in a Linux environment. The most useful tutorial I can find is Writing A Useful Program With NASM. The task I’m setting myself is simple: read a file and write it to stdout. This is what I have:

    section .text           ; declaring our .text segment
    global _start           ; telling where program execution should start

    _start:                 ; this is where code starts getting exec'ed

        ; get the filename in ebx
        pop ebx             ; argc
        pop ebx             ; argv[0]
        pop ebx             ; the first real arg, a filename

        ; open the file
        mov eax, 5          ; open(
        mov ecx, 0          ;   read-only mode
        int 80h             ; );

        ; read the file
        mov eax, 3          ; read(
        mov ebx, eax        ;   file_descriptor,
        mov ecx, buf        ;   *buf,
        mov edx, bufsize    ;   *bufsize
        int 80h             ; );

        ; write to STDOUT
        mov eax, 4          ; write(
        mov ebx, 1          ;   STDOUT,
        ; mov ecx, buf      ;   *buf
        int 80h             ; );

        ; exit
        mov eax, 1          ; exit(
        mov ebx, 0          ;   0
        int 80h             ; );

A crucial problem here is that the tutorial never mentions how to create a buffer, the bufsize variable, or indeed variables at all. How do I do this? (An aside: after at least an hour of searching, I’m vaguely appalled at the low quality of resources for learning assembly. How on earth does any computer run when the only documentation is the hearsay traded on the ‘net?)

I’ve tried to work this out by looking for the equivalent in C, but literally everything uses the stdio.h package instead of simply open, read and write. I’ve even looked at stdio.h, but the function bodies are not defined anywhere.

I meant that I was looking for the bodies of fopen , fread , and fwrite , to see how open , read , and write are used.
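For what it’s worth, a C version that uses only open, read, write and close (no stdio.h) could look like the sketch below. It is not taken from any library source: `cat_file`, `BUFSIZE` and `buf` are names chosen for illustration. The static array plays the role of the missing buffer (storage reserved up front, much like a `resb` reservation in `section .bss`), and the constant plays the role of `bufsize`.

```c
#include <fcntl.h>
#include <unistd.h>

#define BUFSIZE 4096            /* plays the role of `bufsize` */
static char buf[BUFSIZE];       /* statically reserved storage, like a .bss buffer */

/* Hypothetical helper: copy the whole file at `path` to `out_fd`.
 * Returns the total number of bytes copied, or -1 on failure. */
long cat_file(const char *path, int out_fd)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, BUFSIZE)) > 0) {
        ssize_t off = 0;
        while (off < n) {       /* write may accept fewer bytes than requested */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(fd);
                return -1;
            }
            off += w;
        }
        total += n;
    }

    close(fd);
    return (n < 0) ? -1 : total;
}
```

A `main` that calls `cat_file(argv[1], STDOUT_FILENO)` would correspond to the three `pop ebx` instructions that fetch the first command-line argument.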
