I'm trying to write a Perl script that uses the "open" command to open and read a file, storing the information from that file in a hash structure.
This is what is inside my file:
Celena Standard F 01/24/94 Cancer
Jeniffer Orlowski F 06/24/86 None
Brent Koehler M 12/05/97 HIV
Mao Schleich M 04/17/60 Cancer
Goldie Moultrie F 04/05/96 None
Silva Rizzo F 10/26/78 Amyloidosis
Leatha Papenfuss F 10/15/97 CREST
Vita Sabb F 05/28/87 Autism
Alyce Ugarte F 12/21/64 HIV
Ela Prout F 12/05/57 Autism
Mohamed Buchannon M 07/24/91 Caner
Lael Stall M 12/05/97 None
The first column is a name, the second is gender, third is birthdate, fourth is disease. The name is supposed to be the key while the other three columns are the values.
Also, how would I allow the user to change information and write the result to another file?
Since the "columns" in your file seem to be separated by one or more spaces, how do you know where the name "column" ends and the gender "column" starts? If more than one disease is associated with a name, does that add more <space>s to the last "column" in your file? If a disease has more than one word (e.g., diabetes mellitus or mitral valve prolapse), how are diseases separated from each other in the last "column"?
What have you tried to solve this problem on your own?
However, depending on the real input, that might have a serious flaw. First name plus last name is not unique enough. There is a strong possibility that two or more entries contain the same first and last name even though the data refers to different people. Translation: you lose data, since a hash keeps only the last record read for a given key.
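To make the collision concrete, here is a minimal sketch. The two sample lines and the variable names are made up for illustration; it only shows that keying on first and last name alone silently drops the earlier record.

```perl
use strict;
use warnings;

# Two hypothetical records that share a first and last name.
my @lines = (
    'Silva Rizzo F 10/26/78 Amyloidosis',
    'Silva Rizzo F 05/22/81 Dissociative Identity Disorder',
);

my %by_name;
for my $line (@lines) {
    my @record = split ' ', $line;
    # The key is "first last" only, so the second record overwrites the first.
    $by_name{"@record[0..1]"} = "@record[4..$#record]";
}

printf "%d record(s) kept\n", scalar keys %by_name;
print "Silva Rizzo => $by_name{'Silva Rizzo'}\n";
```

Running it reports a single record, and the one that survives is the second line.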
Adding the birthday to the key might help to prevent that. Here's a modification of the previous code, using a modified input to demonstrate handling of a name collision and a multi-word disease:
INPUT:
$ cat name.list
Celena Standard F 01/24/94 Cancer
Jeniffer Orlowski F 06/24/86 None
Brent Koehler M 12/05/97 HIV
Mao Schleich M 04/17/60 Cancer
Goldie Moultrie F 04/05/96 None
Silva Rizzo F 10/26/78 Amyloidosis
Leatha Papenfuss F 10/15/97 CREST
Vita Sabb F 05/28/87 Autism
Alyce Ugarte F 12/21/64 HIV
Ela Prout F 12/05/57 Autism
Silva Rizzo F 22/5/81 Dissociative Indentity Disorder
Mohamed Buchannon M 07/24/91 Caner
Lael Stall M 12/05/97 None
$ cat read_names.pl
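#!/usr/bin/perl
#
use strict;
use warnings;
use Data::Dumper;
my %patient;
# read every line from the files named on the command line (or from STDIN)
while (<>) {
    my @record = split;
    next unless @record >= 5;    # skip blank or incomplete lines
    # key on first name, last name and birthday to avoid name collisions
    $patient{"@record[0..1,3]"} = {
        'gender'   => $record[2],
        'birthday' => $record[3],
        'disease'  => "@record[4..$#record]",    # may be more than one word
    };
}
print Dumper \%patient;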
Note:
The code assumes that the patient's name will always be exactly a first name and a last name, and not a variation such as a first name alone, or a first name, middle name and last name.
Once you have decided how to extract the fields and have tried it against your actual data, show us what you came up with and we can follow up on your second question.
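If names can vary in length, one possible approach is to anchor on the fields that have a fixed shape instead of counting columns. The sketch below is only an illustration under stated assumptions: the gender is a single M or F, the birthday looks like nn/nn/nn, and the helper name parse_patient and the second sample line are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: split one line into fields by anchoring on the
# single-letter gender and the date, so the name may be one or more words.
sub parse_patient {
    my ($line) = @_;
    return unless $line =~ m{
        ^\s*
        (.+?)                         # name: one or more words
        \s+ ([MF])                    # gender
        \s+ (\d{1,2}/\d{1,2}/\d{2})   # birthday
        \s+ (.+?)                     # disease: rest of the line
        \s*$
    }x;
    return { name => $1, gender => $2, birthday => $3, disease => $4 };
}

for my $line ('Mao Schleich M 04/17/60 Cancer',
              'Ana Maria Lopez F 03/09/75 Mitral Valve Prolapse') {
    my $p = parse_patient($line) or next;    # skip lines that do not match
    print join(' | ', @{$p}{qw(name gender birthday disease)}), "\n";
}
```

Lines that do not fit the assumed shape are skipped rather than guessed at, which makes bad input easier to spot.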
By the way, I hope the example does not contain real people's names and birthdays that you happen to be trusted with. That would be a 'terrible' thing to post.
Don't worry, the info I posted isn't about real people. I'll try out your code in a bit, Aia; thank you for the examples. Is there no way for me to accomplish this task of mine with the open command, though? As in something like-
Here's an example of how you might open files to read and to write.
Open the patient file, search for cancer records and write the result to another file.
$ cat read_and_write_names.pl
#!/usr/bin/perl
#
use strict;
use warnings;
my $patient_names = 'patient.list';
#
# open patient.list to read or exit
#
open my $in_file, '<', $patient_names or die "Could not open file $patient_names: $!\n";
#
# structure the database
#
my %patients;
while (<$in_file>) {
    my @record = split;
    next unless @record >= 5;    # skip blank or incomplete lines
    $patients{"@record[0..1,3]"} = {
        'name'     => $record[0],
        'lastname' => $record[1],
        'gender'   => $record[2],
        'birthday' => $record[3],
        'disease'  => "@record[4..$#record]",
    };
}
close $in_file;
#
# to reassemble the original order of fields
#
my @fields = qw(name lastname gender birthday disease);
#
# open new processed list of names or exit
#
my $new_patient_names = 'processed_names.list';
open my $out_file, '>', $new_patient_names or die "Could not open file $new_patient_names: $!\n";
#
# save only patients with cancer. Since the word Cancer can be found misspelled as Caner,
# the pattern makes the second "c" optional to catch that misspelling as well.
# Note that keys() returns the records in no particular order.
#
for my $record (keys %patients) {
    if ($patients{$record}{'disease'} =~ /^Canc?er/i) {
        print {$out_file} join(" ", @{$patients{$record}}{@fields}), "\n";
    }
}
close $out_file or die "Could not close file $new_patient_names: $!\n";
Output:
$ cat processed_names.list
Mohamed Buchannon M 07/24/91 Caner
Mao Schleich M 04/17/60 Cancer
Celena Standard F 01/24/94 Cancer
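As for changing information and writing it to another file, one way it might be done is sketched below. Treat it as a rough illustration, not a finished tool: the in-memory records, the helper names update_patient and write_patients, and the choice to take the output file name from the command line are all assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical in-memory database, keyed the same way as in the scripts above.
my %patients = (
    'Mohamed Buchannon 07/24/91' => {
        name => 'Mohamed', lastname => 'Buchannon',
        gender => 'M', birthday => '07/24/91', disease => 'Caner',
    },
    'Lael Stall 12/05/97' => {
        name => 'Lael', lastname => 'Stall',
        gender => 'M', birthday => '12/05/97', disease => 'None',
    },
);
my @fields = qw(name lastname gender birthday disease);

# Change one field of one record; returns false if the record or field is unknown.
sub update_patient {
    my ($db, $key, $field, $value) = @_;
    return 0 unless exists $db->{$key} && exists $db->{$key}{$field};
    $db->{$key}{$field} = $value;
    return 1;
}

# Write every record, one per line, in the original column order.
sub write_patients {
    my ($fh, $db) = @_;
    for my $key (sort keys %$db) {
        print {$fh} join(' ', @{ $db->{$key} }{@fields}), "\n";
    }
}

# Fix the misspelled disease in one record.
update_patient(\%patients, 'Mohamed Buchannon 07/24/91', 'disease', 'Cancer')
    or warn "No such record or field\n";

# Write to the file named on the command line, or to STDOUT if none is given.
my $out_name = shift @ARGV;
my $out;
if (defined $out_name) {
    open $out, '>', $out_name or die "Could not open file $out_name: $!\n";
} else {
    $out = \*STDOUT;
}
write_patients($out, \%patients);
close $out if defined $out_name;
```

In a real script the key, field and new value could be read from STDIN or from the command line instead of being hard-coded. Keep in mind that the hash key contains the birthday, so changing a birthday or a name would also require storing the record under a new key.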