Transfer data from log file to excel

rainbow1 · November 16, 2021, 7:25am

Hello everyone! I have a log file which contains data as below.

#1
Name: 
Receive time:
Transmit time:

#2
Name: 
Receive time:
Transmit time:

#3
Name: 
Receive time:
Transmit time:
....

#nth
Name: 
Receive time:
Transmit time:

I would like to get the details below into each column of excel file with headers.

#nth(get the value here)
Name: 
Receive time:
Transmit time:

With little knowledge, I could only come up with this

#!/bin/bash
CWD=$(pwd)


	for j in {1..5} # number of iterations

	do 

grep  "Name" test_$j.txt >> output.csv
	done

But it is not working. May i get some help on this?

drysdalk · November 16, 2021, 9:31am

Hello,

Welcome to the forum ! We hope you find this a friendly and helpful place. One thing I'd like to clarify here: would I be right in thinking that it's a CSV file you want to make, rather than an Excel XLS/XLSX file ? It looks like that from the output filename you are using in your script, but your description states you want to generate an Excel file. In a CSV the columns and their values are separated by commas (indeed, that's what CSV stands for - Comma Separated Values). So your file format would look more like this:

name,receive,transmit
foo,4,2
bar,8,3

and so on. So the first line defines the column names, and then the subsequent line contains the values for those columns. So each individual line after the first is a list of the separate fields of the data that you want to produce.

In terms of how to get the data from your input into the output: as things stand, your script will do the same thing each time it passes through the loop, and that is to search the whole of the file test_$j.txt (so presumably you have multiple files called test_1.txt, test_2.txt and so on) for all lines containing occurrences of the exact string "Name" (bear in mind that by default grep is case-sensitive), and append that to a file called output.csv.

So the end result of your script, as it stands, will be to search for all lines containing "Name" in your test_<number>.txt files, and write those lines into output.csv. That won't produce a CSV at all, but it should at least be capturing all the "Name" lines as things are at the moment.

To get this to produce these values as a CSV, you'd have to then further parse the line containing "Name" to print the second field, the one after the colon. You could do this a few ways, but the easiest would be with the cut command, or with awk (which could also do the searching for"Name" and the printing of the second field all in one operation).

Anyway, I hope this helps to explain what your script is doing, and why, and where you might want to look further from here. If you can clarify exactly what format you definitely want your output in (since at the moment it's producing neither an Excel file nor a CSV), then we can take things from there.

rainbow1 · November 16, 2021, 9:53am

I tried to use output file as .xlsx file but i was unable to open the file. So, I tried with CSV file.

munkeHoller · November 16, 2021, 11:18am

@rainbow1 can you provide some sample data from these files, the following command will give us some insight.

head -15 test_[1-5].txt >> assistance.txt

rainbow1 · November 16, 2021, 3:18pm

------------------------------ 
Test #1 
------------------------------ 
Timestamps #1 
------------------------------ 
TX Time                 : 0x0000000000060b7c02555b80 
RX Time                 : 0x0000000000060b7c0257a180 
------------------------------ 
Comparison #1 
------------------------------ 
RX_Time - TX_Time       : 0x24600/2.2734 ns

------------------------------ 
Test #2  
------------------------------ 
Timestamps #2 
------------------------------ 
TX Time                 : 0x0000000000060b7c026c6580 
RX Time                 : 0x0000000000060b7c026eab80 
------------------------------ 
Comparison #2 
------------------------------ 
RX_Time - TX_Time       : 0x24600/2.2734 ns

------------------------------ 
Test #3 
------------------------------ 
Timestamps #3 
------------------------------ 
TX Time                 : 0x0000000000060b7c026c6580 
RX Time                 : 0x0000000000060b7c026eab80 
------------------------------ 
Comparison #3 
------------------------------ 
RX_Time - TX_Time       : 0x24600/2.2734 ns

------------------------------ 
Test #4
------------------------------ 
Timestamps #4 
------------------------------ 
TX Time                 : 0x0000000000060b7c026c6580 
RX Time                 : 0x0000000000060b7c026eab80 
------------------------------ 
Comparison #4 
------------------------------ 
RX_Time - TX_Time       : 0x24600/2.2734 ns

------------------------------ 
Test #5  
------------------------------ 
Timestamps #5 
------------------------------ 
TX Time                 : 0x0000000000060b7c026c6580 
RX Time                 : 0x0000000000060b7c026eab80 
------------------------------ 
Comparison #5 
------------------------------ 
RX_Time - TX_Time       : 0x24600/2.2734 ns

munkeHoller · November 16, 2021, 3:46pm

@rainbow1 , thanks, i was expecting to see output similar to what you alluded to in your initial post,
ie

#nth
Name:
Receive time:
Transmit time:

so, now I am confused, can you correlate the actual output with this please.

as @drysdalk asked, is is a CSV you wish to create by extracting the relevant data , if so , what are the EXACT column names that will eventually end up in the excel sheet, if you can use some actual data from what you've shown that would make the task of helping you so much simpler for all.

rainbow1 · November 16, 2021, 4:13pm

Since im having some trouble with excel file, i will stick with csv file. Below is how the column names will end up in csv file.

Iteration	Name	TX Time	RX Time

Iteration : i have multiple files called test_1.txt, test_2.txt, test_3.txt.... so on, so iteration takes up the j value in test_$j.txt.

Name: Within the test_1.txt file, there will be more tests such as Test #1, Test #2, Test #3.....so on. Under the Name column, j value in Test #j should be returned.

TX Time and RX Time: The time value stated for TX and RX Time, ie

should be extracted.

Im so sorry if i failed to explain anything.

RudiC · November 16, 2021, 8:31pm

How far would this quick shot get you?

awk -F"[:\n]" -vRS= -vOFS="\t" 'BEGIN {print "Iteration\tName\tTX Time\tRX Time"} {print FILENAME, $2, $7, $9}' file[45]
Iteration	Name	TX Time	RX Time
file4	Test #1 	 0x0000000000060b7c02555b80 	 0x0000000000060b7c0257a180 
file4	Test #2  	 0x0000000000060b7c026c6580 	 0x0000000000060b7c026eab80 
file4	Test #3 	 0x0000000000060b7c026c6580 	 0x0000000000060b7c026eab80 
file4	Test #4	 0x0000000000060b7c026c6580 	 0x0000000000060b7c026eab80 
file4	Test #5  	 0x0000000000060b7c026c6580 	 0x0000000000060b7c026eab80 
file5	Test #1 	 0x0000000000060b7c02555b80 	 0x0000000000060b7c0257a180 
file5	Test #2  	 0x0000000000060b7c026c6580 	 0x0000000000060b7c026eab80 
file5	Test #3 	 0x0000000000060b7c026c6580 	 0x0000000000060b7c026eab80 
file5	Test #4	 0x0000000000060b7c026c6580 	 0x0000000000060b7c026eab80 
file5	Test #5  	 0x0000000000060b7c026c6580 	 0x0000000000060b7c026eab80

Replace input file[45] with your .txt files...

drysdalk · November 16, 2021, 9:32pm

Hello,

One thing I think might be helpful for you is a demonstration of the general principle of how to get data out of one file, and into CSV format. So, let's look at a little example.

Imagine we've been asked to produce a CSV file, detailing the stock levels of fruit that we currently have. The inventory is currently described in a text file, like so:

$ cat inventory.txt 
The number of red apples we have is 12
The number of yellow bananas we have is 3
The number of red strawberries we have is 11
The number of blue oranges we have is 0
$

Now, this format is consistent, which is important; it's a lot easier to extract data from a file if every record or line in that file is of the same format. In this case, we have all our data one per line, in the same order always, and in the same format. So, that's pretty much perfect for extracting into a CSV.

How do we do that, then ? Well, there are a variety of ways, but here we'll look at one of the most common methods selected in a situation like this, which is to use awk. The solution we're about to see isn't necessarily the most efficient in terms of lines of code or time taken, but it will certainly demonstrate the principle, which is what we're here to do.

At its simplest, then, we can get awk to print a specific numbered field with the print command. The syntax for doing so is as follows:

$ awk '{print $2}' inventory.txt 
number
number
number
number
$

Here, we asked awk to print the second field of every line in the file inventory.txt. By default, awk regards fields as being separated by white space. So in this case, the second field of every line is the same - the word "number". It does this for every line in turn, hence our getting four instances of hte word "number" returned.

So, to our solution. Let's say we want to get for each line in this file the name of the fruit, the colour, and the number in stock, in that order. Each field should be separated by commas. We could do that like so:

$ awk '{print $5","$4","$9}' inventory.txt 
apples,red,12
bananas,yellow,3
strawberries,red,11
oranges,blue,0
$

And there we go - by using awk and specifying the fields we wanted and the order in which we wanted them printed (and any other text we wanted to go between them, in this case, commas), we have our desired output.

Hope this helps ! If you have any further questions based on this, or indeed on any of the other responses you've had from others on this thread, please do let us know and we can take things from there.

rainbow1 · November 17, 2021, 3:13am

I added the output csv file for the data to be transferred there,

awk -F"[:\n]" -vRS= -vOFS="\t" 'BEGIN {print "Iteration\tName\tTX Time\tRX Time"} {print FILENAME, $2, $7, $9}' test_1.txt

Output in csv file:

"Iteration Name TX Time RX Time"
"test_1.txt Test #1 "
" 0x0000000000060b7c02555b80 "
" 0x0000000000060b7c0257a180 "

The required data did not appear respective columns. I will read up on how to separate them into respective column.

rainbow1 · November 17, 2021, 3:21am

Thank you so much for the detailed explanation.

Lets say the format is not consistent, how do we do that? Should i specify the terms instead of printing the specific numbered field?

munkeHoller · November 17, 2021, 5:32am

@rainbow1 , if the data is not consistent then you need to show that and more importantly share that information UP FRONT, not wait till time spent advising on potential solutions only to be told it won't work because of conditions/situations they are unaware of.

So far you have shown piece of code that obviously never worked against data that you had.
Additionally we have had to ask you to provide sample data - which transpires to look nothing like what you alleged.
Now you are saying that it may further differ in structure and again fail to provide sample or any information as to how that structure is inconsistent or provide any code that you may have written to deal with those inconsistencies. Until you actually describe the REAL challenge in full , provide REAL sample of ALL permutations of how data is stored that need to be dealt with AND the code you have written attempting extract the data then progress in assisting will stop as all we have is guesswork and that not helpful.

Also, can you confirm if this is COURSEWORK/HOMEWORK ?

Can you confirm that the sample provided is ACTUAL output from the command you were asked to execute

head -15 test_[1-5].txt >> assistance.txt

as it is missing the filenames used during the command, i would expect to see a structure such as shown below, which is not what you have provided.

==> byzise1.txt <==
noickOneshagAdlebLuakmuhospyivyifJuenhecirnArdoftOtJavDevusBogva
MevbacBommAmOdjadarpEynMaksidHerebEyctyijakJecholeecAdepvetnedAb

==> byzise2.txt <==
EkyedNeagmorjEnHedlihevRepcokaidiphnewsatDojchibkalkobfujcurabHi
FlolgyiryavvimyadHyidCejCenabirotnantheewtEtMulgyegyicyirrotujcy

==> byzise3.txt <==
sodcifotSuGajhabjedHuebojCeumUjpeunUrcOiskusyespeubipIthLofjagIk
dockienvuicaubTygBaqueitemfegBegdaquecyukOvWyenojFacelWonifOnUt=

==> byzise4.txt <==
ceHajEcyaucsyuofOrdEjIawOtAbrAdcatQuoodGesshogIldyendyaxooridFom
Jerv{EwQueycsEanadMacyemceucgaHamnadMydwigyobpamulfAypmallAcrogs

==> byzise5.txt <==
yurcImHauthmepusodPaxriadGhothyppErg`HycsimnojekexvenVoofWoidIj|
efpitPidlekvagFihotMunLogImvitobryikEarvyeepivwodkaishtAucivAch;

RudiC · November 17, 2021, 10:02am

Obviously your input differs from the sample you posted in #5...

drysdalk · November 17, 2021, 12:53pm

Hello,

I think that one thing that may be worth stepping back and looking at here is the idea of what "consistent input" actually means. In order for a task to be scriptable, there are three things that you absolutely have to be able to define:

The format of your input data
The operation you need to perform on that input in order to obtain your output
The format of your output data

As long as you can do all three of these things, then generally speaking any given task is scriptable. If, however, it is not possible to define a set of rules for these steps that is 100% applicable each and every time, then the ability to automate or script a task becomes extremely difficult at best, and perhaps impossible at worst.

That's not to say that your input always has to be simple, or has to be all on one line, or even has to be easily human-readable. As long as you can come up with a set of rules that you can ultimately turn into code, then any input can be parsed by your script.

But, fundamentally, we need to see a sample of the exact, real-world input that your script will be working with, and we need to know exactly what format you want your output to be in. Without knowing both these things precisely, it's not possible to arrive at a script that can help you do what you need to do.

Hope this helps !

system · January 16, 2022, 12:54pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.