Linux generate random files

Generating a random binary file

Why did it take 5 minutes to generate a 1 KiB file on my low-end laptop, even with little other load? And how could I generate a random binary file faster?

$ time dd if=/dev/random of=random-file bs=1 count=1024
1024+0 records in
1024+0 records out
1024 bytes (1.0 kB) copied, 303.266 s, 0.0 kB/s

real    5m3.282s
user    0m0.000s
sys     0m0.004s
$

Notice that dd if=/dev/random of=random-file bs=1024 count=1 doesn’t work: it generates a random binary file of random length, under 50 B on most runs. Does anyone have an explanation for this too?

5 Answers

That’s because on most systems /dev/random uses random data from the environment, such as static from peripheral devices. Its pool of truly random data (entropy) is very limited, and output blocks until more entropy becomes available.

Retry your test with /dev/urandom (notice the u), and you’ll see a significant speedup.

See Wikipedia for more info. /dev/random doesn’t always produce truly random data, but on your system it clearly does.

$ time dd if=/dev/urandom of=/dev/null bs=1 count=1024
1024+0 records in
1024+0 records out
1024 bytes (1.0 kB) copied, 0.00675739 s, 152 kB/s

real    0m0.011s
user    0m0.000s
sys     0m0.012s
$ time dd if=/dev/urandom of=random-file bs=1 count=1024

The main difference between random and urandom is how they pull random data from the kernel. random always takes data from the entropy pool; if the pool is empty, random blocks until it has been refilled enough. urandom will generate data using a hash algorithm (such as SHA, or sometimes MD5) when the kernel entropy pool is empty, so urandom never blocks.
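On Linux you can watch the pool directly and see this behaviour; the procfs path below exists on typical kernels but may be absent elsewhere:

```shell
# Bits of entropy the kernel currently has available
cat /proc/sys/kernel/random/entropy_avail 2>/dev/null

# /dev/urandom never blocks: reading 1 KiB returns immediately
head -c 1024 /dev/urandom | wc -c    # prints 1024
```

When the number in entropy_avail drops near zero, the same read from /dev/random stalls while the read from /dev/urandom still completes.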

I wrote a script to test the speeds of various hashing functions. For this I wanted files of "random" data, and I didn’t want to use the same file twice so that none of the functions had a kernel cache advantage over the others. I found that both /dev/random and /dev/urandom were painfully slow, so I chose to use dd to copy data off my hard disk starting at random offsets. I would NEVER suggest doing this for anything security-related, but if all you need is noise, it doesn’t matter where you get it. On a Mac use something like /dev/disk0; on Linux use /dev/sda.

Here is the complete test script:

tests=3
kilobytes=102400
commands=(md5 shasum)
count=0
test_num=0
time_file=/tmp/time.out
file_base=/tmp/rand

while [[ test_num -lt tests ]]; do
    ((test_num++))
    for cmd in "${commands[@]}"; do
        ((count++))
        file=$file_base$count
        touch $file

        # slowest
        #/usr/bin/time dd if=/dev/random of=$file bs=1024 count=$kilobytes >/dev/null 2>$time_file

        # slow
        #/usr/bin/time dd if=/dev/urandom of=$file bs=1024 count=$kilobytes >/dev/null 2>$time_file

        # less slow
        /usr/bin/time sudo dd if=/dev/disk0 skip=$(($RANDOM*4096)) of=$file bs=1024 count=$kilobytes >/dev/null 2>$time_file

        echo "dd took $(tail -n1 $time_file | awk '{print $1}') seconds"
        echo -n "$(printf "%7s" $cmd)ing $file: "
        /usr/bin/time $cmd $file >/dev/null

        rm $file
    done
done

Here are the "less slow" /dev/disk0 results:

dd took 6.49 seconds
    md5ing /tmp/rand1: 0.45 real 0.29 user 0.15 sys
dd took 7.42 seconds
 shasuming /tmp/rand2: 0.93 real 0.48 user 0.10 sys
dd took 6.82 seconds
    md5ing /tmp/rand3: 0.45 real 0.29 user 0.15 sys
dd took 7.05 seconds
 shasuming /tmp/rand4: 0.93 real 0.48 user 0.10 sys
dd took 6.53 seconds
    md5ing /tmp/rand5: 0.45 real 0.29 user 0.15 sys
dd took 7.70 seconds
 shasuming /tmp/rand6: 0.92 real 0.49 user 0.10 sys

Here are the "slow" /dev/urandom results:

dd took 12.80 seconds
    md5ing /tmp/rand1: 0.45 real 0.29 user 0.15 sys
dd took 13.00 seconds
 shasuming /tmp/rand2: 0.58 real 0.48 user 0.09 sys
dd took 12.86 seconds
    md5ing /tmp/rand3: 0.45 real 0.29 user 0.15 sys
dd took 13.18 seconds
 shasuming /tmp/rand4: 0.59 real 0.48 user 0.10 sys
dd took 12.87 seconds
    md5ing /tmp/rand5: 0.45 real 0.29 user 0.15 sys
dd took 13.47 seconds
 shasuming /tmp/rand6: 0.58 real 0.48 user 0.09 sys

Here are the "slowest" /dev/random results:

dd took 13.07 seconds
    md5ing /tmp/rand1: 0.47 real 0.29 user 0.15 sys
dd took 13.03 seconds
 shasuming /tmp/rand2: 0.70 real 0.49 user 0.10 sys
dd took 13.12 seconds
    md5ing /tmp/rand3: 0.47 real 0.29 user 0.15 sys
dd took 13.19 seconds
 shasuming /tmp/rand4: 0.59 real 0.48 user 0.10 sys
dd took 12.96 seconds
    md5ing /tmp/rand5: 0.45 real 0.29 user 0.15 sys
dd took 12.84 seconds
 shasuming /tmp/rand6: 0.59 real 0.48 user 0.09 sys

You’ll notice that /dev/random and /dev/urandom were not much different in speed. However, /dev/disk0 took half the time.


PS. I lessened the number of tests and removed all but 2 commands for the sake of "brevity" (not that I succeeded in being brief).


Creating a random file

I need to create a large (~100 MB) file filled with random characters. Ideally using only system tools, and as simply as possible. Any ideas?

2 Answers

You can write ~10⁸ random bytes from /dev/urandom:

head -c 100000000 /dev/urandom > file 
dd if=/dev/urandom of=file bs=100M count=1 iflag=fullblock 
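The iflag=fullblock in the dd variant is worth noting: without it, dd performs a single read() per block, and a large read from /dev/urandom may return fewer bytes than requested, leaving a short file. A quick check (file name and size chosen arbitrarily, GNU dd assumed):

```shell
# fullblock makes GNU dd retry the read until each block is completely filled
dd if=/dev/urandom of=file bs=1M count=1 iflag=fullblock 2>/dev/null
wc -c < file    # prints 1048576
rm file
```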

You can write only printable characters, something like this:
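For instance, a sketch that keeps only printable ASCII (the size is arbitrary):

```shell
# Drop every byte that is not printable ASCII, keep the first ~100 MB
tr -dc '[:print:]' < /dev/urandom | head -c 100000000 > file
```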

Or, as suggested, use base64:

With this approach it should be just as safe (no need to trim the end), but it yields a smaller set of printable characters.

You can use base64 for a more "economical" way to get printable ASCII characters out of a byte stream. Just trim the last few characters of the output; they are rather "non-random" (the padding).
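Putting that together, a sketch (base64 -w0 is a GNU coreutils option that disables line wrapping):

```shell
# Encode random bytes to printable base64 and cut to the exact size;
# cutting also discards the trailing '=' padding
head -c 100000000 /dev/urandom | base64 -w0 | head -c 100000000 > file
```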

If the task is just to create a placeholder file, util-linux already ships a ready-made solution:

~$ fallocate --help

Usage:
 fallocate [options] <filename>

Preallocate space to, or deallocate space from a file.

Options:
 -c, --collapse-range remove a range from the file
 -d, --dig-holes      detect zeroes and replace with holes
 -l, --length <num>   length for range operations, in bytes
 -n, --keep-size      maintain the apparent size of the file
 -o, --offset <num>   offset for range operations, in bytes
 -p, --punch-hole     replace a range with a hole (implies -n)
 -z, --zero-range     zero and ensure allocation of a range
 -v, --verbose        verbose mode
 -h, --help           display this help and exit
 -V, --version        output version information and exit

For more details see fallocate(1).


Generate a random filename in unix shell

I would like to generate a random filename in a unix shell (say tcsh). The filename should consist of 32 random hex digits, e.g.:

c7fdfc8f409c548a10a0a89a791417c5 

(to which I will add whatever is necessary). The point is being able to do it only in the shell, without resorting to a program.

14 Answers

Assuming you are on Linux, the following should work:

cat /dev/urandom | tr -cd 'a-f0-9' | head -c 32 

This is only pseudo-random if your system runs low on entropy, but it is (on Linux) guaranteed to terminate. If you require genuinely random data, cat /dev/random instead of /dev/urandom. This change will make your code block until enough entropy is available to produce truly random output, so it might slow down your code. For most uses, the output of /dev/urandom is sufficiently random.


If you are on OS X or another BSD, you need to modify it to the following:

cat /dev/urandom | env LC_CTYPE=C tr -cd 'a-f0-9' | head -c 32 

This solution was actually doing weird things for me, as it appended a white-backgrounded "%" sign after the actual random hash. But since my shell generally behaves strangely on some occasions, I didn’t want to make this look bad before it was accepted :)

I tried this on a Mac, which has a /dev/urandom. Executing the command in a bash shell causes an error: ‘tr: Illegal byte sequence’

I think the problem here is that BSD and Macs interpret the string as multibyte instead of single byte. I haven’t got a machine to try this on, so report back here if this works: cat /dev/urandom | env LC_CTYPE=C tr -cd 'a-f0-9' | head -c 32

Why not use the unix mktemp command:

$ TMPFILE=`mktemp tmp.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX` && echo $TMPFILE
tmp.MnxEsPDsNUjrzDIiPhnWZKmlAXAO8983

Some implementations have a --dry-run flag to prevent a file from being created. That of course opens a possible race condition.

While it is not a big deal, this utility makes an extra I/O check to test whether the XXXX file exists on disk (just a stat) even when --dry-run is specified. Just a minor consideration, and maybe an insignificant trade-off for a great convenience.

One command, no pipe, no loop:

hexdump -n 16 -v -e '/1 "%02X"' -e '/16 "\n"' /dev/urandom 

If you don’t need the newline, for example when you’re using it in a variable:

hexdump -n 16 -v -e '/1 "%02X"' /dev/urandom 

Using -n 16 generates 32 hex digits.

uuidgen generates exactly this, except you have to remove the hyphens. So I found this to be the most elegant (at least to me) way of achieving this. It should work on Linux and OS X out of the box.

As you probably noticed from each of the answers, you generally have to "resort to a program".

However, without using any external executables, in Bash and ksh:

string=''; for i in {1..32}; do string+=$(printf "%x" $(($RANDOM%16)) ); done; echo $string 
string=''; for i in {1..32}; do string+=$(printf "%x" $(($RANDOM%16)) ); dummy=$RANDOM; done; echo $string 

Change the lower case x in the format string to an upper case X to make the alphabetic hex characters upper case.

Here’s another way to do it in Bash but without an explicit loop:

printf -v string '%X' $(printf '%.2s ' $((RANDOM%16))' '{00..31}) 

In the following, "first" and "second" printf refer to the order in which they’re executed rather than the order in which they appear in the line.

This technique uses brace expansion to produce a list of 32 random numbers mod 16, each followed by a space and one of the numbers in the braced range, followed by another space (e.g. 11 00 ). For each element of that list, the first printf strips off all but the first two characters using its format string ( %.2s ), leaving either single digits followed by a space each, or two digits. The space in the format string ensures that there is then at least one space between each output number.

Читайте также:  Линукс где найти флешку

The command substitution containing the first printf is not quoted so that word splitting is performed and each number goes to the second printf as a separate argument. There, the numbers are converted to hex by the %X format string and they are appended to each other without spaces (since there aren’t any in the format string) and the result is stored in the variable named string .

When printf receives more arguments than its format string accounts for, the format is applied to each argument in turn until they are all consumed. If there are fewer arguments, the unmatched format string (portion) is ignored, but that doesn’t apply in this case.

I tested it in Bash 3.2, 4.4 and 5.0-alpha. But it doesn’t work in zsh (5.2) or ksh (93u+) because RANDOM only gets evaluated once in the brace expansion in those shells.

Note that because the mod operator is applied to a value that ranges from 0 to 32767, the distribution of digits from these snippets could be skewed (not to mention that the numbers are pseudo-random in the first place). However, since we’re using mod 16 and 32768 is divisible by 16, that isn’t a problem here.

In any case, the correct way to do this is using mktemp as in Oleg Razgulyaev’s answer.


generate a random file using shell script

How can I generate a file filled with random numbers or characters in a shell script? I also want to specify the size of the file.

6 Answers

Use the dd command to read data from /dev/random:

dd if=/dev/random of=random.dat bs=1000000 count=5000 

That would read 5000 blocks of 1 MB of random data, which is a whole 5 gigabytes!

Experiment with the blocksize argument to get optimal performance.

After a second read of the question, I think they also wanted to save only characters (alphabetic ones, I’m guessing) and numbers to the file.

That dd command is unlikely to complete, as there will not be 5 gigabytes of entropy available. Use /dev/urandom if you need this much "randomness".

head -c 10 /dev/random > rand.txt 

Change 10 to whatever you need. Read man random for the differences between /dev/random and /dev/urandom.

Or, for base64 characters only:

head -c 10 /dev/random | base64 | head -c 10 > rand.txt 

The base64 output might include some characters you’re not interested in, but I didn’t have time to come up with a better one-liner character converter. (Also, we’re taking too many bytes from /dev/random. Sorry, entropy pool!)

Oops, I missed the characters-and-numbers part; I’m guessing you mean alphanumeric characters. I need to revise.
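The revision can stay a one-liner: filter the byte stream down to alphanumerics with tr (size arbitrary):

```shell
# 1 KiB of random letters and digits only
tr -dc 'a-zA-Z0-9' < /dev/urandom | head -c 1024 > rand.txt
```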

#!/bin/bash
# Created by Ben Okopnik on Wed Jul 16 18:04:33 EDT 2008

######## User settings ############
MAXDIRS=5
MAXDEPTH=2
MAXFILES=10
MAXSIZE=1000
######## End of user settings ############

# How deep in the file system are we now?
TOP=`pwd|tr -cd '/'|wc -c`

populate() {
    cd $1
    curdir=$PWD

    files=$(($RANDOM*$MAXFILES/32767))
    for n in `seq $files`
    do
        f=`mktemp XXXXXX`
        size=$(($RANDOM*$MAXSIZE/32767))
        head -c $size /dev/urandom > $f
    done

    depth=`pwd|tr -cd '/'|wc -c`
    if [ $(($depth-$TOP)) -ge $MAXDEPTH ]
    then
        return
    fi

    unset dirlist
    dirs=$(($RANDOM*$MAXDIRS/32767))
    for n in `seq $dirs`
    do
        d=`mktemp -d XXXXXX`
        dirlist="$dirlist $PWD/$d"
    done

    for dir in $dirlist
    do
        populate "$dir"
    done
}

populate $PWD

