Select columns from a file

Posted: November 27, 2012 in Linux, Uncategorized

In this post, I am going to share some simple ways to select columns from a text file using Unix commands. I have posted screenshots for better understanding.

1. Let's create some sample data and store it in sample.txt. I have used the cat command to print the file contents.

Content:
aaa bbb ccc ddd
eee fff ggg hhh
iii jjj kkk lll
mmm nnn ooo ppp
aaa bbb ccc ddd
cat sample.txt

[Screenshot: sample data]
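
The file can be created in several ways. As a minimal sketch, a here-document writes the five lines in one go. The scratch directory made with mktemp -d is my own assumption, used so that an existing sample.txt is not overwritten.

```shell
# Work in a scratch directory so nothing existing is overwritten (assumption).
cd "$(mktemp -d)" || exit 1

# Write the five sample lines with a here-document.
cat > sample.txt <<'EOF'
aaa bbb ccc ddd
eee fff ggg hhh
iii jjj kkk lll
mmm nnn ooo ppp
aaa bbb ccc ddd
EOF

# Print the file to confirm its contents.
cat sample.txt
```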

2. Suppose we just need to fetch the first column from our file. We can do that using the awk command:

awk '{print $1}' sample.txt

[Screenshot: awk command]

3. Okay. Now we need multiple columns from our file. Then we can do this:

awk '{print $1$2$3}' sample.txt

[Screenshot: awk command]

4. From the picture above we can see that the fields are simply concatenated, with nothing between them. Let's add a field separator between the columns for readability:

awk '{print $1"|"$2"|"$3}' sample.txt

[Screenshot: awk command]
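
As an alternative to typing the separator between every pair of fields, awk can insert it for us: fields separated by commas in print are joined with the output field separator, OFS, which can be set with -v. This is only a sketch of that variant, not the method used in the rest of this post. The printf line recreates two of the sample rows in a scratch directory (my assumption) so the snippet runs on its own.

```shell
# Scratch directory and a two-row copy of the sample data (assumption).
cd "$(mktemp -d)" || exit 1
printf 'aaa bbb ccc ddd\neee fff ggg hhh\n' > sample.txt

# Commas in print insert OFS between the fields; here OFS is set to "|".
awk -v OFS='|' '{print $1, $2, $3}' sample.txt
# prints:
# aaa|bbb|ccc
# eee|fff|ggg
```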

5. Now we have delimited output. Let's save it in extracted.txt by redirecting the output:

awk '{print $1"|"$2"|"$3}' sample.txt > extracted.txt

[Screenshot: awk command]

6. Let's pull the first column out of the extracted file by applying the awk command from step 2:

awk '{print $1}' extracted.txt

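[Screenshot: awk command]

7. Oh no! Something is wrong. Did you notice? We got the wrong result. Why?
It's because awk splits fields on whitespace (spaces and tabs) by default, but here the columns are separated by a delimiter (|) that we forgot to specify in the awk command. Since the lines contain no whitespace, awk treats each whole line as the first field. Now let's try again with the delimiter, using the -F option: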

awk -F "|" '{print $1}' extracted.txt

[Screenshot: awk command]
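
To see the difference side by side, the sketch below runs the awk command with and without -F on a two-line copy of extracted.txt. The file is recreated with printf in a scratch directory, which is my assumption so that the snippet is self-contained.

```shell
# Scratch directory and a two-row copy of the delimited file (assumption).
cd "$(mktemp -d)" || exit 1
printf 'aaa|bbb|ccc\neee|fff|ggg\n' > extracted.txt

# Without -F, awk splits on whitespace; there is none, so $1 is the whole line.
awk '{print $1}' extracted.txt
# prints:
# aaa|bbb|ccc
# eee|fff|ggg

# With -F "|", awk splits on the pipe character and $1 is the first column.
awk -F "|" '{print $1}' extracted.txt
# prints:
# aaa
# eee
```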

8. Yeah, we got the expected output. Let's try the very same process with the cut command, which is easier to apply.
We fetch the first column using cut. Here -d denotes the delimiter and -f denotes the field number.

cut -d "|" -f 1 extracted.txt

[Screenshot: cut command]
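
The -f option also accepts a list or a range of field numbers, which covers the "multiple columns" case from step 3. A small sketch, again on a two-line copy of extracted.txt recreated in a scratch directory (my assumption):

```shell
# Scratch directory and a two-row copy of the delimited file (assumption).
cd "$(mktemp -d)" || exit 1
printf 'aaa|bbb|ccc\neee|fff|ggg\n' > extracted.txt

# A comma-separated list selects several fields; cut keeps the delimiter between them.
cut -d "|" -f 1,3 extracted.txt
# prints:
# aaa|ccc
# eee|ggg

# An open-ended range selects field 2 and everything after it.
cut -d "|" -f 2- extracted.txt
# prints:
# bbb|ccc
# fff|ggg
```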

9. We got the column data. Let's sort it. For that, just use a pipe, which passes the output of the previous command to sort as its input.

cut -d "|" -f 1 extracted.txt | sort

[Screenshot: cut command]

10. The contents are sorted. But did you notice? Some values are repeated. I don't want that, so I use sort -u, where -u means unique: only one copy of each repeated line is kept.

cut -d "|" -f 1 extracted.txt | sort -u

[Screenshot: cut command]
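
For readers who cannot see the screenshots, here is a sketch of what steps 9 and 10 print for the sample data. The printf line rebuilds extracted.txt in a scratch directory (my assumption) so the snippet runs on its own.

```shell
# Scratch directory and a full five-row copy of the delimited file (assumption).
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'aaa|bbb|ccc' 'eee|fff|ggg' 'iii|jjj|kkk' 'mmm|nnn|ooo' 'aaa|bbb|ccc' > extracted.txt

# Step 9: sorted first column; the duplicate "aaa" is still there.
cut -d "|" -f 1 extracted.txt | sort
# prints:
# aaa
# aaa
# eee
# iii
# mmm

# Step 10: sort -u keeps one copy of each repeated line.
cut -d "|" -f 1 extracted.txt | sort -u
# prints:
# aaa
# eee
# iii
# mmm
```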

11. If you want to find where a word occurs in a file, just use the grep command, which prints every line that contains the word:

cut -d "|" -f 1 extracted.txt | grep aaa

[Screenshot: cut command]
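
If the position matters and not just the matching text, grep's -n option prefixes each match with its line number. A minimal sketch, with extracted.txt rebuilt in a scratch directory (my assumption):

```shell
# Scratch directory and a full five-row copy of the delimited file (assumption).
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'aaa|bbb|ccc' 'eee|fff|ggg' 'iii|jjj|kkk' 'mmm|nnn|ooo' 'aaa|bbb|ccc' > extracted.txt

# -n prints "line number:matching line" for every match in the first column.
cut -d "|" -f 1 extracted.txt | grep -n aaa
# prints:
# 1:aaa
# 5:aaa
```

Because cut emits exactly one output line per input line, these numbers are also the line numbers in extracted.txt.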

12. If I need to know how many times a word occurs, I add wc -l, which counts the lines that grep printed (so a line containing the word twice still counts once).

cut -d "|" -f 1 extracted.txt | grep aaa | wc -l

[Screenshot: cut command]
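
grep can also do the counting itself: the -c option prints the number of matching lines, so the extra wc -l is optional. A short sketch, with extracted.txt rebuilt in a scratch directory (my assumption):

```shell
# Scratch directory and a full five-row copy of the delimited file (assumption).
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'aaa|bbb|ccc' 'eee|fff|ggg' 'iii|jjj|kkk' 'mmm|nnn|ooo' 'aaa|bbb|ccc' > extracted.txt

# -c counts matching lines instead of printing them.
cut -d "|" -f 1 extracted.txt | grep -c aaa
# prints:
# 2
```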

