Extract multiple columns like this example?

Hi all, I am new to this and I have the following problem.
I have a CSV file like this:

ecStudentID,StudentNetworkUsername,StudentName,StudentFirstName,StudentMiddleName
100,ASTDTBJ001,"Andrew,Logan Connor",Logan    ,Connor
100,ASTDTBJ001,"Andrew,Logan Connor",Logan    ,Connor
101,BSTLTAN001,"Isaac,Matthew ",Matthew  ,,Isaac

How can I extract the ecStudentID and StudentFirstName columns, without duplicate rows? For example, like this:

ecStudentID,StudentFirstName
100,Logan
101,Matthew

PS: Is there any way to do this with a loop such as while or for?
Thanks in advance.

Hello,

Welcome to the forum! We hope you find this to be a friendly and helpful place. Firstly, could I ask you to confirm if this is related to academic work, schoolwork, coursework or homework of any kind? If it is then we can still help you, but we'd need that to be declared clearly and up-front from the start, since whether or not your question is related to coursework will determine exactly what type of assistance we'd be able to provide.

Secondly, could you provide a bit more information on your operating environment, please? At a minimum it would be helpful to know the operating system, version and shell that you're using (e.g. Ubuntu 20.04 LTS on x86_64 using the Bash shell). Knowing the OS and shell is helpful because it gives us a better idea of the tools and options that may be available to you in implementing your solution.

If you could get back to us with answers to the above questions, then we can take things from there.

Hi, this is homework from my school and I'm trying to solve it. As for the OS, I use the Bash shell in Visual Studio Code, on WSL Linux I think. Thank you.

Hello,

Thank you for confirming that - I've now moved this thread to the "Homework & Coursework Questions" topic. So: there are many ways of doing this, but one method that a lot of people tend to choose in these situations is to use the awk command. One of the things awk can do quite easily and quickly is pick out parts of its input and print them in whatever format is desired as its output. It can do a lot more than that as it happens - it's an entire programming language all to itself - but this is definitely one of the most common tasks for which it's typically used.

So, I'd encourage you to search for previous threads here on this forum regarding printing multiple columns with awk, as this is a topic that has come up a great deal over the years. I'll give you a few quick examples however.

Imagine we have the following input file:

$ cat fruits.txt 
apple orange strawberry banana
raspberry blackberry kiwi tangerine
blueberry lemon grapefruit melon
$ 

Now, let's say we want to print the second field of each line in this file. We can do that easily with awk as follows:

$ awk '{print $2}' fruits.txt 
orange
blackberry
lemon
$ 

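Now let's imagine we've been asked to print the first and fourth fields, with a space between them. The comma between $1 and $4 tells awk to separate them with its output field separator, which is a single space by default. Here's how we do that: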

$ awk '{print $1,$4}' fruits.txt
apple banana
raspberry tangerine
blueberry melon
$ 

Lastly, let's see how we add a bit more formatting to our output. Let's say we've been asked to print a report stating "The first fruit on this line is X" for the first fruit on each line. Here's how we do that:

$ awk '{print "The first fruit on this line is",$1}' fruits.txt 
The first fruit on this line is apple
The first fruit on this line is raspberry
The first fruit on this line is blueberry
$ 

So, that's a quick summary of how to do the most basic kind of field selecting and output formatting with awk.

Now, one question for you to consider and research the answer to: how do you handle printing multiple fields when they're not separated by spaces, but with commas, as they are in your input? There must be a way to specify a different field separator, and indeed there is. Have a go, and see what you can come up with.
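As a hint rather than a full answer, here is a minimal sketch on a colon-separated variant of the fruits file. The file contents and the variable names fruits and first_and_fourth are made up for illustration; only the -F option (input field separator) and the OFS variable (output field separator) are standard awk.

```shell
# Hypothetical colon-separated variant of fruits.txt (not from the thread).
fruits=$(mktemp)
cat > "$fruits" <<'EOF'
apple:orange:strawberry:banana
raspberry:blackberry:kiwi:tangerine
blueberry:lemon:grapefruit:melon
EOF

# -F sets the input field separator; OFS is what print puts between fields.
first_and_fourth=$(awk -F':' -v OFS=':' '{print $1,$4}' "$fruits")
printf '%s\n' "$first_and_fourth"
rm -f "$fruits"
```

The same idea should carry over to other single-character separators by changing the value given to -F and OFS.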

Hope this helps!

One problem is the comma embedded in the quoted StudentName field: a simple split on commas in awk or bash treats it as a field separator too.
However, if that field always contains exactly one embedded comma, you can simply add 1 to the field number to address the fields that follow it.
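To illustrate the offset, here is a rough sketch using the sample data from the first post. It assumes every data row has exactly one comma inside the quoted StudentName field, so StudentFirstName is field 5 on data rows but field 4 on the header row. The variable names csv, result, first, key and seen are only for illustration, and trimming the trailing spaces is my own assumption based on the desired output.

```shell
# Sample data copied from the question; the temporary file is hypothetical.
csv=$(mktemp)
cat > "$csv" <<'EOF'
ecStudentID,StudentNetworkUsername,StudentName,StudentFirstName,StudentMiddleName
100,ASTDTBJ001,"Andrew,Logan Connor",Logan    ,Connor
100,ASTDTBJ001,"Andrew,Logan Connor",Logan    ,Connor
101,BSTLTAN001,"Isaac,Matthew ",Matthew  ,,Isaac
EOF

result=$(awk -F',' '
    # The header row has no embedded comma, so the name column is field 4.
    NR == 1 { print $1 "," $4; next }
    {
        first = $5                 # field 4 shifted by one (embedded comma)
        sub(/ +$/, "", first)      # drop trailing spaces
        key = $1 "," first
        # Print each ID and name pair only the first time it is seen.
        if (!(key in seen)) { seen[key] = 1; print key }
    }
' "$csv")
printf '%s\n' "$result"
rm -f "$csv"
```

This would break on a row with zero or several commas inside the quotes; for such data a real CSV parser is the safer choice.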

Thank you so much, I will try that.
