Ygor, I'm directing this question to you, but anyone else who can explain it clearly is welcome to answer (you don't have to dumb it down terribly, just enough to make sense). I'm getting caught up on how the end of this command is interpreted:
sed 's/\(.*\)==-=-.*\(\.\)/\1\2/'
I guess it'd be best if I tell you what I'm seeing and what I understand:
sed - I know what the command does with its associated flags; it's the stream editor.
's - I know this is the substitute command.
\( - I understand this is used to make the character after it be the "literal" character you see. Thus ( means ( as you see it.
.* - Ygor, I see you stated that .* means any number of characters, but I get slightly confused here. From my understanding, the * character is a wildcard for any number of characters and the . character is a wildcard for exactly one character. Does .* in sed "regular expression" terms translate into "any number of characters"?
\) - This means ) as you see it, like the ( pattern above.
==-=- - I understand that pattern is the literal pattern.
.* - This appears again. Does it mean the same thing as the first one? I'm confused about whether, since the literal string comes immediately before the ., the . will be interpreted literally instead of as "any number of characters."
\( - Once again escaping out the character to literally mean (.
\. - I came up with this piece and it seems to work in grabbing the .pdf or .txt extensions, but to be honest I'm unsure why it's working. I thought the . character would be interpreted as a single wildcard character. Instead it is escaped, if I'm interpreting correctly, so it takes the literal . character as what the second pattern is looking for.
\)\1\2/' - I understand the escaped characters and how it fits the patterns together.
A short explanation of those couple of spots would be greatly appreciated. I like getting the answers, don't get me wrong, but I like to take it one step further and understand the inner workings. It's how you truly learn a command...
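One way to probe what each piece matches is to print the two \( \) groups separately. The sketch below is only an experiment, not a definitive explanation: it takes one file name from the listing in this post and wraps \1 and \2 in square brackets (the brackets and the variable names are added purely for illustration).

```shell
# One file name copied from the directory listing in this post.
name='cconveyancestatusg5q0aCC1JK-aBRIok8L+jg==-=-43766338.pdf'

# \( and \) delimit a group; \1 and \2 in the replacement stand for
# whatever the first and second group matched.
# The [ ] in the replacement are plain text, added only to show the groups.
groups=$(printf '%s\n' "$name" | sed 's/\(.*\)==-=-.*\(\.\)/[\1][\2]/')

printf '%s\n' "$groups"
# prints: [cconveyancestatusg5q0aCC1JK-aBRIok8L+jg][.]pdf
```

The second group holds only the literal dot, and the "pdf" after it was never part of the match, so sed leaves it where it was.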
Here are some examples of output to see how this is working out:
ls -l
total 6
-rw-r----- 1 root root 0 Mar 19 20:22 cconvey=acnastatusz+23423==-=-2340289723423089724.txt
-rw-r----- 1 root root 17 Mar 19 19:11 cconveyancestatusg5q0aCC1JK-aBRIok8L+jg==-=-43766338.pdf
-rw-r----- 1 root root 18 Mar 19 19:11 cconveyancestatuskYMXtXkxtren0pSQ-l7J+Q==-=-48489900.pdf
-rw-r----- 1 root root 19 Mar 19 19:12 cconveyancestatusz+45hkPLw9xe78iTNMrNwQ==-=-22077524.pdf
Above you see the list of example files in the directory.
ls | sed 's/\(.*\)==-=-.*\(\.pdf\)/\1\2/'
cconvey=acnastatusz+23423==-=-2340289723423089724.txt
cconveyancestatusg5q0aCC1JK-aBRIok8L+jg.pdf
cconveyancestatuskYMXtXkxtren0pSQ-l7J+Q.pdf
cconveyancestatusz+45hkPLw9xe78iTNMrNwQ.pdf
This is using Ygor's first example. It works on the .pdf files, but the .txt file is left unchanged. I then went to work on finding a way to accept any characters at the end (pdf and txt are good, but there are some extensions longer than 3 characters out there).
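The behaviour above can be reproduced without the directory by feeding two of the listed names to sed directly. This is a minimal sketch; the variable names are made up for the example.

```shell
# Two names from the listing: one .txt, one .pdf.
txt='cconvey=acnastatusz+23423==-=-2340289723423089724.txt'
pdf='cconveyancestatusz+45hkPLw9xe78iTNMrNwQ==-=-22077524.pdf'

# The pattern requires a literal ".pdf", so the .txt line does not match.
# s/// only rewrites lines that match; every other line is printed as-is.
out=$(printf '%s\n' "$txt" "$pdf" | sed 's/\(.*\)==-=-.*\(\.pdf\)/\1\2/')

printf '%s\n' "$out"
```

The .txt name comes out exactly as it went in, which is why it looks "excluded" in the output.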
ls | sed 's/\(.*\)==-=-.*\(\.*\)/\1\2/'
cconvey=acnastatusz+23423
cconveyancestatusg5q0aCC1JK-aBRIok8L+jg
cconveyancestatuskYMXtXkxtren0pSQ-l7J+Q
cconveyancestatusz+45hkPLw9xe78iTNMrNwQ
I put in the third .* (written as \.* inside the second group) because I thought it would stand for any number of characters. As you can see above, it only returns the first part, so I knew I had done something wrong.
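Printing the groups in brackets suggests where this attempt goes wrong. In the sketch below (brackets and variable names are illustrative only), \.* reads as "zero or more literal dots" rather than "any number of characters", so the second group can be satisfied by matching nothing at all.

```shell
# One file name copied from the directory listing in this post.
name='cconveyancestatusz+45hkPLw9xe78iTNMrNwQ==-=-22077524.pdf'

# The unescaped .* after ==-=- consumes everything up to the end of the
# line, and \.* (zero or more literal dots) then matches the empty string.
result=$(printf '%s\n' "$name" | sed 's/\(.*\)==-=-.*\(\.*\)/[\1][\2]/')

printf '%s\n' "$result"
# prints: [cconveyancestatusz+45hkPLw9xe78iTNMrNwQ][]
```

The empty second pair of brackets shows that \2 is empty, and since the whole rest of the line was matched, nothing is left over to keep the extension.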
ls | sed 's/\(.*\)==-=-.*\(\...\)/\1\2/'
cconvey=acnastatusz+23423.txt
cconveyancestatusg5q0aCC1JK-aBRIok8L+jg.pdf
cconveyancestatuskYMXtXkxtren0pSQ-l7J+Q.pdf
cconveyancestatusz+45hkPLw9xe78iTNMrNwQ.pdf
I used 3 dots (\...) to make it grab those 3 characters, whatever they may be. But that still didn't solve the problem of extensions longer than 3 characters.
ls | sed 's/\(.*\)==-=-.*\(\....\)/\1\2/'
cconvey=acnastatusz+23423.txt
cconveyancestatusg5q0aCC1JK-aBRIok8L+jg.pdf
cconveyancestatuskYMXtXkxtren0pSQ-l7J+Q.pdf
cconveyancestatusz+45hkPLw9xe78iTNMrNwQ.pdf
I added a 4th ., and it worked, but it felt like I took the easy way around it, a sort of cheesy way to counter the problem I was having. This led me to my final try at it:
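The three-dot and four-dot attempts can be compared by printing the groups in brackets. This is a sketch using the .txt name from the listing; the brackets and variable names are illustrative only.

```shell
# The .txt name from the directory listing in this post.
name='cconvey=acnastatusz+23423==-=-2340289723423089724.txt'

# \... is an escaped (literal) dot followed by two "any character" dots.
three=$(printf '%s\n' "$name" | sed 's/\(.*\)==-=-.*\(\...\)/[\1][\2]/')

# \.... is a literal dot followed by three "any character" dots.
four=$(printf '%s\n' "$name" | sed 's/\(.*\)==-=-.*\(\....\)/[\1][\2]/')

printf '%s\n' "$three" "$four"
# prints: [cconvey=acnastatusz+23423][.tx]t
#         [cconvey=acnastatusz+23423][.txt]
```

With three dots the group captures only ".tx"; the final "t" falls outside the match and is left in place, which is why both versions end up printing the full extension.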
ls | sed 's/\(.*\)==-=-.*\(\.\)/\1\2/'
cconvey=acnastatusz+23423.txt
cconveyancestatusg5q0aCC1JK-aBRIok8L+jg.pdf
cconveyancestatuskYMXtXkxtren0pSQ-l7J+Q.pdf
cconveyancestatusz+45hkPLw9xe78iTNMrNwQ.pdf
I accidentally deleted the 3 trailing dots, hit enter, and received the correct end result. My questions about how it works are above, but these were the examples I tried on the way to getting it right. I hope you can see where my logic was going when trying to get the answer. Thanks again for your patience and help with this.
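For comparison, here is a sketch of the final command next to a variant that captures the whole extension explicitly. Two things here are assumptions and not part of the original examples: the file name scan==-=-12345.jpeg is hypothetical, and the variant treats the extension as "the last dot plus everything after it".

```shell
# Hypothetical name with a 4-character extension (not from the listing).
name='scan==-=-12345.jpeg'

# Final command from this post: \2 is just the dot; "jpeg" is outside
# the match, so sed leaves it in place after the replacement.
short=$(printf '%s\n' "$name" | sed 's/\(.*\)==-=-.*\(\.\)/\1\2/')

# Variant: \.[^.]*$ captures a literal dot plus every non-dot character
# up to the end of the line, so \2 is the whole extension.
explicit=$(printf '%s\n' "$name" | sed 's/\(.*\)==-=-.*\(\.[^.]*\)$/\1\2/')

printf '%s\n' "$short" "$explicit"
# prints: scan.jpeg
#         scan.jpeg
```

Both give the same result for this name; the variant only makes the intent of "keep the extension" visible in the pattern itself.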
~Ryan