Shift Question (Perl)

I am attempting to write a script that reads each line of a file into a separate array, does some work on it, and then puts it all back together. I think I need to use the 'shift()' command to read each line into its own array, but I need help nesting it into a while loop (while not eof).

So here is some of the raw data in the file:

Vallejo-1991-Jan-20-The_Bride-BLH
Vallejo-1991-Jan-20-The_Bride-BLH
Berkeley-1992-Jan-26-I_Corinth_14-BLH
berkeley-1992-Jan-26-I_Corinth_14-BLH
Union City-1991-July-14-Promises_covenent_of_circumcision-BLH
UC-1991-July-14-Promises_covenent_of_circumcision-BLH

I want the output to replace the Location (Vallejo, etc.) with a consistent syntax: capital first letter and city spelled out, e.g. Union City, Vallejo, Berkeley (there are only these three locations). More importantly, I want to change the Month element into a number (e.g. July -> 07, Jan -> 01). Unfortunately the abbreviations aren't always consistent, BUT the first three letters usually are (Jul, Jan, Aug, etc.).

So the output of the data above should be:

Vallejo-1991-01-20-The_Bride-BLH
Vallejo-1991-01-20-The_Bride-BLH
Berkeley-1992-01-26-I_Corinth_14-BLH
Berkeley-1992-01-26-I_Corinth_14-BLH
Union City-1991-07-14-Promises_covenent_of_circumcision-BLH
Union City-1991-07-14-Promises_covenent_of_circumcision-BLH

Here is what I have so far:

# Open File containing raw data
open(FILE, "test2.txt") or die("unable to open file");

# read file into an array
@RawData = <FILE>;
close(FILE);
while (<>) {

#read in Folder name
@FileNames = shift(@RawData);

I know how to write the code to split the data into arrays, BUT I don't know how to analyze and replace the data...

You are somehow trying to read the same file twice. This line reads the ENTIRE file into the @RawData array:

@RawData = <FILE>;

Then you try to start reading from files listed on the command line (or standard input) with

while (<>) {
  1. Maybe you mean:
open(FILE, "test2.txt") or die("...");
while (<FILE>) { 
   # process one line in $_ at a time...
}

I think that's what you want.

  2. To process each line, use split():
# inside while loop
($city,$year,$month,$day,$title,$speaker)=split('-'); 

# Convert month name to a number, or print existing value if not found in mapping.
$month=exists $month2int{$month} ? $month2int{$month} : $month;

# Canonicalize city name: use the value found in the map; if not in the map, just capitalize first letter. 
$city=exists $citymap{$city} ? $citymap{$city} : ucfirst($city);

# reconstruct and print out line.
print join("-",$city,$year,$month,$day,$title,$speaker);
  3. Define your "mappings" for month names and cities. Do this before the while loop. Fill in the ellipses with the rest of the information you'll need....
%citymap = ( UC => "Union City", VJ => "Vallejo", ... );
%month2int  = ( Jan => 1, Feb => 2, ...., Jul => 7, July => 7, ... Okt => 10, Oct => 10, October => 10, ... );
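
Putting the three steps together, a minimal sketch might look like the following. It assumes deliberately incomplete maps, uses a hypothetical helper name (normalize_line), and feeds in two sample lines from the question instead of reading test2.txt, so that it runs on its own. With these maps the months come out unpadded (1, 7).

```perl
use strict;
use warnings;

# Illustrative, incomplete maps: extend them with whatever spellings occur in the data.
my %citymap   = ( "UC" => "Union City", "VJ" => "Vallejo" );
my %month2int = ( "Jan" => 1, "Jul" => 7, "July" => 7 );

# Hypothetical helper: normalize one "City-Year-Month-Day-Title-Speaker" line.
sub normalize_line {
    my ($line) = @_;
    chomp $line;    # drop the trailing newline before splitting
    my ($city, $year, $month, $day, $title, $speaker) = split /-/, $line;

    # Fall back to the original value when a key is not in the map.
    $month = exists $month2int{$month} ? $month2int{$month} : $month;
    $city  = exists $citymap{$city}    ? $citymap{$city}    : ucfirst($city);

    return join("-", $city, $year, $month, $day, $title, $speaker);
}

# Sample lines taken from the question; a real script would read them from the file.
my @sample = (
    "berkeley-1992-Jan-26-I_Corinth_14-BLH\n",
    "UC-1991-July-14-Promises_covenent_of_circumcision-BLH\n",
);
print normalize_line($_), "\n" for @sample;
```

Keeping the per-line work in one small sub makes it easy to try single lines by hand before pointing the loop at the real file.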

Or maybe you mean:

@RawData = <FILE>;
while ($_ = shift @RawData) {
...
}
## @RawData is empty

But in that case, it's simpler, better, and faster to use:

@RawData = <FILE>;
foreach (@RawData) {
 ...
}
## @RawData contains processed data.
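
To make the difference between the two loops concrete, here is a small sketch on a two-element array standing in for the file contents. The defined() test in the shift loop is my addition: without it, the loop would also stop early on a false value such as a final line containing just "0".

```perl
use strict;
use warnings;

my @lines = ("vallejo-1991\n", "berkeley-1992\n");

# Variant 1: shift consumes the array, so results must be collected elsewhere.
my @queue = @lines;
my @done;
while (defined(my $line = shift @queue)) {
    push @done, ucfirst($line);
}
# @queue is now empty; @done holds the processed lines.

# Variant 2: foreach aliases $_ to each element, so changes happen in place.
my @in_place = @lines;
foreach (@in_place) {
    $_ = ucfirst($_);
}
# @in_place still has two elements, both modified.

print scalar(@queue), " left after shift, ", scalar(@in_place), " kept by foreach\n";
```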

Okay, I'm getting a syntax error. Here is my code:

open(FILE, "test2.txt") or die ("The file is not found");

$citymap = ( UC => "Union City", VJ => "Vallejo", Vallejo => "Vallejo", Union City => "Union City", berk => "Berkeley", Berk => "Berkeley" );
$month2int  = ( Jan => 1, Feb => 2, Mar => 3, mar => 3, March => 3, Apr => 4, apr => 4, April => 4, may => 5, May => 5, Jun => 6, Jul => 7, July => 7, jul => 7, aug => 8, Aug => 8, August => 8, august => 8, Sept => 9, September => 9, sept => 9, Okt => 10, Oct => 10, October => 10, oct => 10, nov => 11, Nov => 11, November => 11, november => 11, Dec => 12, December => 12, december => 12);

while (<FILE>) {
($city,$year,$month,$day,$title,$speaker)=split('-');
$month=exists $month2int{$month} ? $month2int{$month} : $month;
$city=exists $citymap{$city} ? $citymap{$city} : ucfirst($city);
print join("-",$city,$year,$month,$day,$title,$speaker);
}

I run the command:

perl ConvertingFileNames

and my error is:

syntax error at ConvertingFileNames line 3, near "Union City"
Execution of ConvertingFileNames aborted due to compilation errors.

I noticed in the code you gave me you had a prefix of "%" instead of "$" for the variables defining the month and Location...I tried running with both and it gives me the same error

I like to talk my code out in words, so correct me if I'm wrong here, but this is what I have so far:

  1. Open the file with the file names
  2. Define variables to compare the file to
  3. While not eof take each line of the file and split it by "-" and put each split value into a unique variable
  4. Check two of the variables (month and location) against the two variables defined before the "while" statement
  5. make the appropriate changes or do nothing if nothing matches
  6. print each line of the file back in the same order it was found with the appropriate changes
  7. Close file

Can you help me with syntax?

The problem is where you have:

Union City => "Union City",

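The space is throwing off Perl: a hash key that contains a space must be quoted. (Also keep the % prefix on %citymap and %month2int; with $ you assign to a scalar, and the lookups in the loop find nothing.) But you DON'T NEED to map every city name. If a name is not in the hash, the script simply capitalizes the first letter and keeps the rest. So: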

berkeley => 'Berkeley', Vallejo => 'Vallejo'

are also not needed. However this IS needed (just in case):

"union city" => "Union City"

Otherwise you will get "Union city".
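
A small sketch of the city lookup, with one variation that is my own assumption rather than part of the answer above: keys are stored in lowercase and the input is lowercased before the lookup, so "UC", "uc" and "Union City" all hit the same entries. The name canonical_city is hypothetical.

```perl
use strict;
use warnings;

# Keys are stored lowercase and quoted; a key containing a space must be quoted.
my %citymap = (
    "uc"         => "Union City",
    "union city" => "Union City",
    "vj"         => "Vallejo",
    "berk"       => "Berkeley",
);

# Hypothetical helper: look the city up case-insensitively,
# otherwise just capitalize the first letter.
sub canonical_city {
    my ($city) = @_;
    my $key = lc $city;
    return exists $citymap{$key} ? $citymap{$key} : ucfirst($key);
}

print canonical_city($_), "\n" for ("UC", "union city", "berkeley", "Vallejo");
```

Without the "union city" entry, the fallback would return ucfirst("union city"), which is "Union city".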

Okay, I got the script working... Thank you VERY much for your help. One more thing though; it's minor, don't worry...

I want the output to be as follows

YEAR-MN-DY-YR- Subject -BLH
So there should be a space before and after the subject.
Also, I need the Month to be two digits.
I tried altering the code by putting Feb => 02 instead of Feb => 2, but it errors out.
I also tried changing the print line to ...,0$month,... or ...,'0' month,... and that errors out as well.

Can you give me some guidance on syntax?

#!/usr/bin/perl

open(FILE, "test3.txt") or die ("The file is not found");

%citymap = ( UC => "Union City", VJ => "Vallejo", Vallejo => "Vallejo", berk => "Berkeley", Berk => "Berkeley" );
%month2int = ( Jan => 1, Feb => 2, Mar => 3, mar => 3, March => 3, Apr => 4, apr => 4, April => 4, may => 5, May => 5, Jun => 6, Jul => 7, July => 7, jul => 7, aug => 8, Aug => 8, August => 8, august => 8, Sept => 9, September => 9, sept => 9, Okt => 10, Oct => 10, October => 10, oct => 10, nov => 11, Nov => 11, November => 11, november => 11, Dec => 12, December => 12, december => 12);

while (<FILE>) {
($city,$year,$month,$day,$title,$speaker)=split('-');
$month=exists $month2int{$month} ? $month2int{$month} : $month;
$city=exists $citymap{$city} ? $citymap{$city} : ucfirst($city);
print join("- ",$city,$year,$month,$day,$title,$speaker);
}

Never mind... I just discovered "printf" :)

In the hash table, you should still quote the numbers ("02") or remove the leading 0s. Otherwise Perl reads a number with a leading 0 as octal: 02 through 07 lose their zero, and 08 and 09 fail to compile with an "Illegal octal digit" error.
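
One way to get two-digit months without touching the map is to keep plain integers and pad only when formatting, which is what printf and sprintf do with "%02d". The sketch below assumes a cut-down map and a hypothetical format_line helper. Its field order (city first) and the spaces around the subject are my reading of the request above, not a confirmed format.

```perl
use strict;
use warnings;

# Keep plain integers in the map and pad only when formatting.
my %month2int = ( "Jan" => 1, "Feb" => 2, "Aug" => 8, "Oct" => 10 );

# Hypothetical helper: zero-pad the month and put spaces around the subject.
sub format_line {
    my ($city, $year, $month, $day, $title, $speaker) = @_;
    return sprintf("%s-%s-%02d-%s- %s -%s", $city, $year, $month, $day, $title, $speaker);
}

for my $name (qw(Jan Aug Oct)) {
    printf("%s -> %02d\n", $name, $month2int{$name});
}
print format_line("Vallejo", 1991, $month2int{"Jan"}, 20, "The_Bride", "BLH"), "\n";
```

Note that "%02d" expects a number, so a month that was not found in the map (and is still a name) should be handled before formatting.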

I don't know if I should start a new thread for this, since it concerns the same code that I'm working on in this thread, but anyway, I have another question.

I'm trying to make a series of directories with the newly created names that I have generated...here is the loop that I have, and it doesn't seem to be working:

}
open(FOLDER, "temp.txt");
while ($test = <FOLDER>) {
#print $test;
mkdir('$test',0777) || print $!;
}

For some reason this code keeps producing ints instead of the actual characters contained in the file; perhaps it's trying to count each line? I'm still very hazy on how to read files into variables line by line. Can anyone enlighten me?

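Remove the single quotes around $test inside the mkdir call. Single quotes do not interpolate variables, so '$test' asks for a directory literally named $test. You will also want to chomp($test) first, since each line read from the file still ends with a newline.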
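
A hedged sketch of the directory loop follows. It assumes one folder name per line, reads from an in-memory string standing in for temp.txt, and creates the directories under a throwaway temporary directory (via the core File::Temp module) so it can be run safely.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work inside a throwaway directory so the sketch does not touch real folders.
my $base = tempdir(CLEANUP => 1);

# In-memory stand-in for temp.txt (one folder name per line).
my $names = "Vallejo-1991-01-20-The_Bride-BLH\nBerkeley-1992-01-26-I_Corinth_14-BLH\n";
open(my $folder, "<", \$names) or die "cannot open in-memory file: $!";

while (my $name = <$folder>) {
    chomp $name;            # drop the trailing newline
    next if $name eq "";    # skip blank lines
    # Double quotes interpolate $base and $name; single quotes would not.
    mkdir("$base/$name", 0777) or warn "mkdir $name failed: $!";
}
close($folder);
```

In the real script, the open call would name temp.txt and the mkdir path would drop the $base prefix.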

Amending the %hash below to cover the rest of your months and cities should help you some.

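# Map three-letter month prefixes and city abbreviations to canonical values.
%hash = (Jan => "01", Jul => "07", UC => "Union City");
open(FH, "<", "a.txt") or die("unable to open a.txt: $!");
while (<FH>) {
	@arr = split("-", $_);
	# Canonicalize the city, or just capitalize the first letter.
	$arr[0] = $hash{$arr[0]} ? $hash{$arr[0]} : ucfirst($arr[0]);
	# Look the month up by its first three letters; keep it as-is if unknown.
	$mon = substr($arr[2], 0, 3);
	$arr[2] = $hash{$mon} if $hash{$mon};
	print join "-", @arr;
}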
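
Since the first three letters of the month are usually consistent, the month table can be generated instead of typed out. The sketch below is one possible approach, not the code from the posts above: it lowercases the first three letters before the lookup, returns already zero-padded strings, and returns the input unchanged when nothing matches. The name month_number is hypothetical, and the extra "okt" entry mirrors a spelling from the map earlier in the thread.

```perl
use strict;
use warnings;

# Build jan => "01" ... dec => "12" with a hash slice.
my @abbr = qw(jan feb mar apr may jun jul aug sep oct nov dec);
my %month_num;
@month_num{@abbr} = map { sprintf("%02d", $_) } 1 .. 12;
$month_num{"okt"} = "10";    # variant spelling seen in the thread's map

# Hypothetical helper: look up a month by its lowercased first three letters.
sub month_number {
    my ($name) = @_;
    my $key = lc substr($name, 0, 3);
    return exists $month_num{$key} ? $month_num{$key} : $name;
}

print month_number($_), "\n" for qw(Jan July sept OCTOBER Okt);
```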