Script to compact information!

Hello Group,

Could you please help me with a shell script that condenses each day's 12 hourly rows into a single row per day, as follows:

Final Fields in each row
1. Ticker Name
2. D (Daily)
3. Date
4. Open Price (the open price of the first record of the day)
5. High Price (the highest high across the day's 12 rows)
6. Low Price (the lowest low across the day's 12 rows)
7. Close (the close price of the last record of the day)
8. Volume (the sum of the volume fields for the day)

Final Result of the rows:
<ticker>,<per>,<date>,<open>,<high>,<low>,<close>,<vol>
EUR_USD,D,11/02/2009,1.4783,1.4844,1.4727,1.4772,11633
EUR_USD,D,11/03/2009,1.4635,1.4732,1.4631,1.4731,12887
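Since this is an aggregation keyed on the date field, the rule above can be sketched in a minimal self-contained awk run. The two sample rows below are abridged from the day's data, with the second row's volume standing in for the remaining hours:

```shell
# Daily aggregation rule: per date, keep the first open, the running max
# high, the running min low, the last close, and the summed volume.
awk -F, '
{
    d = $3
    ticker[d] = $1
    if (!(d in open)) open[d] = $5             # open of the first record
    if (high[d] == "" || high[d] < $6) high[d] = $6
    if (low[d]  == "" || low[d]  > $7) low[d]  = $7
    closep[d] = $8                             # last close seen wins
    vol[d]   += $9
}
END {
    for (d in open)
        print ticker[d] ",D," d "," open[d] "," high[d] "," low[d] "," closep[d] "," vol[d]
}' <<'EOF'
EUR_USD,I,11/02/2009,12:00,1.4783,1.4784,1.4766,1.4774,824
EUR_USD,I,11/02/2009,23:00,1.4773,1.4844,1.4727,1.4772,10809
EOF
```

This prints `EUR_USD,D,11/02/2009,1.4783,1.4844,1.4727,1.4772,11633`, matching the first expected row above.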

The information that I want to process is in the following format:

<ticker>,<per>,<date>,<time>,<open>,<high>,<low>,<close>,<vol>
EUR_USD,I,11/02/2009,12:00,1.4783,1.4784,1.4766,1.4774,824
EUR_USD,I,11/02/2009,13:00,1.4773,1.4775,1.4746,1.4754,854
EUR_USD,I,11/02/2009,14:00,1.4753,1.4779,1.4739,1.4777,1320
EUR_USD,I,11/02/2009,15:00,1.4778,1.4844,1.4778,1.4817,2389
EUR_USD,I,11/02/2009,16:00,1.4816,1.4834,1.4813,1.4827,1302
EUR_USD,I,11/02/2009,17:00,1.4828,1.4831,1.4761,1.4762,1053
EUR_USD,I,11/02/2009,18:00,1.4761,1.4778,1.4731,1.4733,1148
EUR_USD,I,11/02/2009,19:00,1.4734,1.4771,1.4727,1.4751,1112
EUR_USD,I,11/02/2009,20:00,1.4752,1.4770,1.4747,1.4765,936
EUR_USD,I,11/02/2009,21:00,1.4764,1.4775,1.4763,1.4773,296
EUR_USD,I,11/02/2009,22:00,1.4772,1.4774,1.4760,1.4772,219
EUR_USD,I,11/02/2009,23:00,1.4773,1.4778,1.4769,1.4772,180
EUR_USD,I,11/03/2009,12:00,1.4635,1.4659,1.4631,1.4654,1207
EUR_USD,I,11/03/2009,13:00,1.4655,1.4668,1.4643,1.4651,1219
EUR_USD,I,11/03/2009,14:00,1.4652,1.4662,1.4633,1.4657,1690
EUR_USD,I,11/03/2009,15:00,1.4658,1.4696,1.4642,1.4676,2365
EUR_USD,I,11/03/2009,16:00,1.4675,1.4681,1.4631,1.4655,1592
EUR_USD,I,11/03/2009,17:00,1.4656,1.4695,1.4645,1.4685,1104
EUR_USD,I,11/03/2009,18:00,1.4686,1.4710,1.4678,1.4709,1006
EUR_USD,I,11/03/2009,19:00,1.4710,1.4732,1.4702,1.4702,988
EUR_USD,I,11/03/2009,20:00,1.4701,1.4722,1.4700,1.4711,723
EUR_USD,I,11/03/2009,21:00,1.4710,1.4732,1.4710,1.4721,371
EUR_USD,I,11/03/2009,22:00,1.4721,1.4727,1.4706,1.4716,395
EUR_USD,I,11/03/2009,23:00,1.4717,1.4732,1.4715,1.4731,227

I really appreciate your help and support

Sincerely,

Carlos S

This looks like homework; I have seen similar CSV files before.

So tell us, what did you try already?

my %hash;
while (<DATA>) {
	chomp;
	my @tmp = split /,/;
	my $key = "$tmp[0],D,$tmp[2]";
	$hash{$key}{open} = $tmp[4] unless exists $hash{$key}{open};    # first open of the day
	$hash{$key}{high} = $tmp[5] if !exists $hash{$key}{high} || $hash{$key}{high} < $tmp[5];
	$hash{$key}{low}  = $tmp[6] if !exists $hash{$key}{low}  || $hash{$key}{low}  > $tmp[6];
	$hash{$key}{close} = $tmp[7];                                   # last close seen wins
	$hash{$key}{vol}  += $tmp[8];
}
for my $key (sort keys %hash) {                                     # sort keys for a stable order
	print join(",", $key, @{$hash{$key}}{qw(open high low close vol)}), "\n";
}
__DATA__
EUR_USD,I,11/02/2009,12:00,1.4783,1.4784,1.4766,1.4774,824
EUR_USD,I,11/02/2009,13:00,1.4773,1.4775,1.4746,1.4754,854
EUR_USD,I,11/02/2009,14:00,1.4753,1.4779,1.4739,1.4777,1320
EUR_USD,I,11/02/2009,15:00,1.4778,1.4844,1.4778,1.4817,2389
EUR_USD,I,11/02/2009,16:00,1.4816,1.4834,1.4813,1.4827,1302
EUR_USD,I,11/02/2009,17:00,1.4828,1.4831,1.4761,1.4762,1053
EUR_USD,I,11/02/2009,18:00,1.4761,1.4778,1.4731,1.4733,1148
EUR_USD,I,11/02/2009,19:00,1.4734,1.4771,1.4727,1.4751,1112
EUR_USD,I,11/02/2009,20:00,1.4752,1.4770,1.4747,1.4765,936
EUR_USD,I,11/02/2009,21:00,1.4764,1.4775,1.4763,1.4773,296
EUR_USD,I,11/02/2009,22:00,1.4772,1.4774,1.4760,1.4772,219
EUR_USD,I,11/02/2009,23:00,1.4773,1.4778,1.4769,1.4772,180
EUR_USD,I,11/03/2009,12:00,1.4635,1.4659,1.4631,1.4654,1207
EUR_USD,I,11/03/2009,13:00,1.4655,1.4668,1.4643,1.4651,1219
EUR_USD,I,11/03/2009,14:00,1.4652,1.4662,1.4633,1.4657,1690
EUR_USD,I,11/03/2009,15:00,1.4658,1.4696,1.4642,1.4676,2365
EUR_USD,I,11/03/2009,16:00,1.4675,1.4681,1.4631,1.4655,1592
EUR_USD,I,11/03/2009,17:00,1.4656,1.4695,1.4645,1.4685,1104
EUR_USD,I,11/03/2009,18:00,1.4686,1.4710,1.4678,1.4709,1006
EUR_USD,I,11/03/2009,19:00,1.4710,1.4732,1.4702,1.4702,988
EUR_USD,I,11/03/2009,20:00,1.4701,1.4722,1.4700,1.4711,723
EUR_USD,I,11/03/2009,21:00,1.4710,1.4732,1.4710,1.4721,371
EUR_USD,I,11/03/2009,22:00,1.4721,1.4727,1.4706,1.4716,395
EUR_USD,I,11/03/2009,23:00,1.4717,1.4732,1.4715,1.4731,227

Hello rdcwayx,

The purpose of transforming the attached file is to use the data to make currency-trading decisions. I want to condense all the rows of each day of the file into one row per day. My intention is to use Ubuntu and a shell script for this.

Thanks for your prompt response.

Carlos

$ awk -F, '{
date[$3]=$3
if (open[$3]=="") open[$3]=$5
if (high[$3]<$6) high[$3]=$6
if (low[$3]=="") low[$3]=$7
if (low[$3]>$7) low[$3]=$7
closep[$3]=$8
vol[$3]+=$9
}
END {for (i in open) print "EUR_USD,D",date[i],open[i],high[i],low[i],closep[i],vol[i]}
' OFS="," EUR_USD-1H.txt | sort -t, -k3r | head

EUR_USD,D,<date>,<open>,<high>,<low>,<close>,0
EUR_USD,D,11/04/2009,1.4764,1.4805,1.4740,1.4796,3046
EUR_USD,D,11/03/2009,1.4635,1.4732,1.4631,1.4731,12887
EUR_USD,D,11/02/2009,1.4783,1.4844,1.4727,1.4772,11633
EUR_USD,D,11/01/2009,1.4728,1.4741,1.4703,1.4708,1297
EUR_USD,D,10/30/2009,1.4804,1.4820,1.4711,1.4715,10863
EUR_USD,D,10/29/2009,1.4728,1.4858,1.4718,1.4838,10865
EUR_USD,D,10/28/2009,1.4779,1.4815,1.4690,1.4716,12978
EUR_USD,D,10/27/2009,1.4876,1.4887,1.4770,1.4821,12250
EUR_USD,D,10/26/2009,1.5024,1.5035,1.4844,1.4859,10336

Hello rdcwayx and Everybody.

I tried the awk script with the attached ASCII file and it works fine, but the output is in the wrong date order.

Could you please help me make the final output ascend by date, like the original input file?

Thanks in advance

Carlos


Change the sort in the above command from:

sort -t, -k3r

to

sort -t, -k3

And there is no need for the head command.
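One caveat worth noting: `sort -t, -k3` compares the MM/DD/YYYY strings lexically, so dates order correctly within a single year but not across a year boundary (01/05/2010 would sort before 12/30/2009). Sorting on the character positions of year, month, and day inside field 3 avoids that. A sketch with made-up rows:

```shell
# Field 3 is MM/DD/YYYY: chars 1-2 month, 4-5 day, 7-10 year.
# Sort numerically by year first, then month, then day.
printf '%s\n' \
  'EUR_USD,D,01/05/2010,1.4288,1.4337,1.4255,1.4300,9000' \
  'EUR_USD,D,12/30/2009,1.4331,1.4412,1.4311,1.4338,8000' |
sort -t, -k3.7,3.10n -k3.1,3.2n -k3.4,3.5n
```

This prints the 12/30/2009 row before the 01/05/2010 row, which a plain `-k3` would get backwards.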

Hello rdcwayx and Everybody,

How can I add a filter that ignores the Sunday rows, either before or after processing the data with the shell script?

Thanks in advance.

Carlos

awk -F, '{
date[$3]=$3
if (open[$3]=="") open[$3]=$5
if (high[$3]<$6) high[$3]=$6
if (low[$3]=="") low[$3]=$7
if (low[$3]>$7) low[$3]=$7
closep[$3]=$8
vol[$3]+=$9
}
END {for (i in open) print "EUR_USD,D",date[i],open[i],high[i],low[i],closep[i],vol[i]}
' OFS="," EUR_USD-1H.txt | sort -t, -k3 | awk -F, '{
cmd = "date -d \"" $3 "\" +%a"
cmd | getline day
close(cmd)
if (day != "Sun") print
}'
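The weekday filter can be exercised on its own. The key points are quoting the date for the shell and closing the pipe after each getline so awk does not run out of file descriptors. This sketch assumes GNU date (as on Ubuntu) and an English locale for `%a`; 11/01/2009 really was a Sunday, as the thread's own data suggests:

```shell
# Keep only non-Sunday rows, asking GNU date for the weekday abbreviation.
printf '%s\n' \
  'EUR_USD,D,11/01/2009,1.4728,1.4741,1.4703,1.4708,1297' \
  'EUR_USD,D,11/02/2009,1.4783,1.4844,1.4727,1.4772,11633' |
awk -F, '{
    cmd = "LC_ALL=C date -d \"" $3 "\" +%a"   # e.g. date -d "11/01/2009" +%a
    cmd | getline day
    close(cmd)                                 # close the pipe before the next row
    if (day != "Sun") print                    # 11/01/2009 is a Sunday and is dropped
}'
```

Only the 11/02/2009 (Monday) row survives the filter.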

Hello rdcwayx and Everybody again,

I need your help with an additional filter so I can use another tool to make charts.

I need to create another output file with the date field format changed as follows:

From
EUR_USD-USA,D,04/29/2009,1.3248,1.3339,1.3222,1.3251,8694
EUR_USD-USA,D,04/30/2009,1.3231,1.3278,1.3190,1.3226,7753
EUR_USD-USA,D,05/01/2009,1.3261,1.3278,1.3238,1.3265,3959

To
EUR_USD-USA,D,20090429,1.3248,1.3339,1.3222,1.3251,8694
EUR_USD-USA,D,20090430,1.3231,1.3278,1.3190,1.3226,7753
EUR_USD-USA,D,20090501,1.3261,1.3278,1.3238,1.3265,3959

Thanks in advance.

Carlos

awk -F, 'split($3,a,"/") {print $1,$2,a[3]a[1]a[2],$4,$5,$6,$7,$8}' OFS=, urfile
gawk -F"," '{ split($3,a,"/") ; $3=a[3]a[1]a[2]}1' OFS="," file.txt

:D:D:D:D

Hello rdcwayx and Everybody,

The shell script that converts the date from MM/DD/YYYY to YYYYMMDD works fine, but it breaks when the date field does not have 8 digits.
For that kind of file I would like to pad the date field in every row to 8 characters. I know that the command date -d +%D could help with this purpose.

SPY,D,11/2/2009,104.13,105.41,103.08,104.32,254222900
SPY,D,11/3/2009,103.74,104.8,103.54,104.65,228362600
SPY,D,11/4/2009,105.51,106.33,104.65,104.92,247996700
SPY,D,11/5/2009,105.66,106.88,105.44,106.85,180015300
SPY,D,11/6/2009,106.26,107.4,106.05,107.13,170954100
SPY,D,11/9/2009,107.95,109.63,107.87,109.57,159495700
SPY,D,11/10/2009,109.31,109.93,108.97,109.59,171899800
SPY,D,11/11/2009,110.31,110.82,109.62,110.15,169466200
SPY,D,11/12/2009,110,110.57,108.75,109.03,157144500
SPY,D,11/13/2009,109.31,110.09,108.75,109.62,150963000
SPY,D,11/16/2009,110.38,111.69,110.32,111.21,210922200
SPY,D,11/17/2009,110.92,111.39,110.5,111.34,147134100
SPY,D,11/18/2009,111.26,111.43,110.57,111.27,156486800
SPY,D,11/19/2009,110.51,111.31,109.13,109.82,208734600
SPY,D,11/20/2009,109.25,109.76,109.01,109.43,134196000
SPY,D,11/23/2009,110.72,111.74,110.6,110.82,148010200
SPY,D,11/24/2009,111,111.2,110.01,110.99,138420100
SPY,D,11/25/2009,111.17,111.5,110.82,111.38,109564800
SPY,D,11/27/2009,108.4,110.32,108.29,109.57,126001800
SPY,D,11/30/2009,109.48,110.2,108.12,109.94,160874800
SPY,D,12/1/2009,110.92,111.66,110.73,111.3,159613700
SPY,D,12/2/2009,111.28,112.01,110.92,111.25,131331000

An example of the final output of the file could be

SPY,D,20091201,110.92,111.66,110.73,111.3,159613700
SPY,D,20091202,111.28,112.01,110.92,111.25,131331000

I really appreciate your help with a script for this issue

Carlos

Use the below:

gawk -F"," '{ split($3, a, "/"); $3 = sprintf("%04d%02d%02d", a[3], a[1], a[2]) }1' OFS="," file.txt

sprintf zero-pads single-digit months as well as single-digit days.
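To sanity-check the zero padding end to end, here is a self-contained run on made-up rows that include a single-digit month as well as a single-digit day (split and sprintf are POSIX awk, so plain awk works too):

```shell
# %04d pads the year, %02d pads month and day with leading zeros,
# so 1/5/2010 becomes 20100105 and 11/2/2009 becomes 20091102.
printf '%s\n' \
  'SPY,D,1/5/2010,113.33' \
  'SPY,D,11/2/2009,104.13' |
awk -F, '{ split($3, a, "/"); $3 = sprintf("%04d%02d%02d", a[3], a[1], a[2]) } 1' OFS=,
```

A version that pads only the day field would leave a date like 1/5/2010 as 201015, which is why both month and day need the `%02d` treatment.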

:D:D:D:D
