sed:
The \1 in the replacement refers to whatever was matched inside the \( \) in the search pattern.
The search pattern must end with .* so that the rest of the line is matched as well and is therefore discarded by the substitution.
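To make that concrete, here is a minimal sketch. The input line and the function name extract_code are made up for illustration, since the real data isn't shown in this thread, so adjust the pattern to your actual format.

```shell
# Hypothetical input line (an assumption; the real data is not shown here).
line='WORD1 code=ABC123 other=xyz tail'

# \( \) captures the run of non-blank characters that follows "code=".
# The leading and trailing .* match everything else on the line,
# so replacing the whole match with \1 leaves only the captured part.
extract_code() {
    printf '%s\n' "$1" | sed 's/.*code=\([^ ]*\).*/\1/'
}

extract_code "$line"
```

With the sample line above this prints ABC123. Without the trailing .* the text after the code would be left in the output.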
awk:
By using two field delimiters (" " and "="), the input line is split into the fields "WORD1", "code1", "value1-idkey1", "code2", "value1".
These can then be referenced as $1, $2, $3, $4 and $5, respectively.
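A small sketch of that splitting, assuming an input line reconstructed from the field names above (the real line isn't quoted here, and the function name third_field is just for illustration):

```shell
# Hypothetical line, reconstructed from the field list above (an assumption).
line='WORD1 code1=value1-idkey1 code2=value1'

# -F'[ =]' makes awk treat either a single space or "=" as the field separator,
# so the line splits into: WORD1, code1, value1-idkey1, code2, value1.
third_field() {
    printf '%s\n' "$1" | awk -F'[ =]' '{ print $3 }'
}

third_field "$line"
```

For the sample line this prints value1-idkey1. Note that with this separator two consecutive blanks would produce an empty field, so the field numbers shift if the spacing varies.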
Thank you everybody for your help.
The simple test I made with sed in a shell script works well with your recommendations.
Now I have written a small piece of code using Qt Creator.
My question is about regular expression not Qt.
In terms of regular expressions, what is the meaning of:
"code=(.+)\\s"
I think my problem arises because I must tell it (I don't know how!) that the end of the word is the first blank encountered, not the last one if there is more than one.
As you know, my purpose is to extract a character code. For that I am using this code:
I haven't used Qt, but I believe it's: code= - the literal code= , followed by (.+) - one or more of any character as capture group #1, followed by \\s - a whitespace character (normally just \s , but the backslash is doubled because it sits inside a C++ string literal)
The capture group is there so you can extract that matched part afterwards (presumably with match.captured(1) ).
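EDIT: Your regex is matching too much - quantifiers such as + are greedy by default, so (.+) consumes as much of the string as it can while still letting the rest of the pattern match. In this case, that's everything from code= up to (but excluding) the last whitespace on the line, i.e. everything except the last word.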
Rather than using (.+) to match any character, try using ([^\\s]+) to match non-whitespace characters (as in the earlier sed suggestions).
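Here is a quick way to see the difference from the shell. This is only a sketch: the sample line and the function names greedy_match and nonspace_match are made up, and it uses sed -E rather than Qt, but the greedy behaviour is the same idea.

```shell
# Hypothetical input line (an assumption; not taken from the original data).
line='WORD1 code=ABC123 other=xyz tail'

# Greedy version: (.+) grabs as much as it can and only stops
# before the LAST blank on the line.
greedy_match() {
    printf '%s\n' "$1" | sed -E 's/.*code=(.+) .*/\1/'
}

# Non-whitespace version: ([^[:space:]]+) cannot cross a blank,
# so the capture stops at the FIRST blank after the code.
nonspace_match() {
    printf '%s\n' "$1" | sed -E 's/.*code=([^[:space:]]+).*/\1/'
}

greedy_match "$line"
nonspace_match "$line"
```

For the sample line the first call prints "ABC123 other=xyz" and the second prints "ABC123", which is the behaviour you were after.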