copy certain spaces from a file
I have a file which looks like this
18DMA H 9996 0.886 5.687 5.320
18DMA H 9997 1.019 5.764 5.247
18DMA Np 9998 0.947 5.584 5.151
18DMA H 9999 1.033 5.541 5.113
18DMA Cn10000 0.880 5.674 5.050
18DMA H10001 0.831 5.616 4.971
18DMA H10002 0.814 5.751 5.091
18DMA H10003 0.957 5.735 5.003
18DMA Cn10004 0.837 5.486 5.185
The desired output has column 3 deleted. However, from a certain line onwards there is no space between the atom name and the atom number, so I cannot delete by column. Is there any way to delete by selecting a certain number of characters instead? The desired output should be
18DMA H 0.886 5.687 5.320
18DMA H 1.019 5.764 5.247
18DMA Np 0.947 5.584 5.151
18DMA H 1.033 5.541 5.113
18DMA Cn 0.880 5.674 5.050
18DMA H 0.831 5.616 4.971
18DMA H 0.814 5.751 5.091
18DMA H 0.957 5.735 5.003
18DMA Cn 0.837 5.486 5.185
text-processing
edited Dec 23 '18 at 13:11 by Kusalananda
asked Dec 23 '18 at 12:56 by Dimitris Mintis
clarify your atom name and number
– RomanPerekhrest
Dec 23 '18 at 13:01
My problem is in the line 18DMA Cn10000 0.880 5.674 5.050: since there is no space between Cn and 10000, I cannot proceed with copying the desired column. Instead of copying a column, I somehow need to copy certain characters to make it work.
– Dimitris Mintis
Dec 23 '18 at 13:05
5 Answers
Use cut in character mode:
cut -c1-15,21-
You may need to tweak the exact character numbers.
This assumes the input doesn't use TABs (\t characters) as delimiters (which it probably doesn't, since then you wouldn't have the problem of the joined fields in the first place).
If there are tabs, then the expand program can convert them to spaces.
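As a rough sketch of the character-mode approach, assuming a fixed-width layout in which the number to be dropped occupies characters 16-20 (this is what the ranges above imply; the sample widths and the variable names sample and result are made up for illustration):

```shell
# Hypothetical fixed-width sample: 5+5+5 characters of labels, a 5-character
# atom number, then three 8-character coordinates. The widths are an assumption.
sample=$(printf '%5s%-5s%5s%5s%8s%8s%8s\n' \
    18 DMA H 9999 1.033 5.541 5.113 \
    18 DMA Cn 10000 0.880 5.674 5.050)

# expand turns any tabs into spaces (a no-op here); cut then keeps
# characters 1-15 and 21 onwards, dropping the number in characters 16-20.
result=$(printf '%s\n' "$sample" | expand | cut -c1-15,21-)
printf '%s\n' "$result"
```

Since cut counts characters rather than fields, it does not matter whether the atom name and number are separated by a space.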
edited Dec 24 '18 at 11:16 by Volker Siegel
answered Dec 23 '18 at 13:10 by peterph
Assuming you don't have <TAB>s but multiple spaces as field separators, and by looking at and counting your sample data, I came up with
$ sed -E 's/^(.{15}).{5}/\1/' file
18DMA H 0.886 5.687 5.320
18DMA H 1.019 5.764 5.247
18DMA Np 0.947 5.584 5.151
18DMA H 1.033 5.541 5.113
18DMA Cn 0.880 5.674 5.050
18DMA H 0.831 5.616 4.971
18DMA H 0.814 5.751 5.091
18DMA H 0.957 5.735 5.003
18DMA Cn 0.837 5.486 5.185
It's using a "back reference" for the first 15 characters to restore them using \1 in the replacement part of the s (substitute) command.
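For anyone who wants to try this without the original file, here is a small sketch on made-up fixed-width data (the widths in the printf format and the variable names are assumptions, chosen so that the number sits in characters 16-20). It also shows an equivalent basic-regex form for a sed that lacks -E:

```shell
# Hypothetical fixed-width sample; the widths are an assumption, not from the question.
sample=$(printf '%5s%-5s%5s%5s%8s%8s%8s\n' \
    18 DMA H 9999 1.033 5.541 5.113 \
    18 DMA Cn 10000 0.880 5.674 5.050)

# ERE form, as in the answer: capture 15 characters, drop the next 5.
ere=$(printf '%s\n' "$sample" | sed -E 's/^(.{15}).{5}/\1/')

# Equivalent BRE form, where groups and intervals need backslashes.
bre=$(printf '%s\n' "$sample" | sed 's/^\(.\{15\}\).\{5\}/\1/')

printf '%s\n' "$ere"
```

Both forms should give the same result; only the escaping of the group and the interval differs.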
answered Dec 23 '18 at 13:07 by RudiC
$ awk -v OFS='\t' 'NF == 5 { sub("[0-9]*$", "", $2) } NF == 6 { $0 = $1 OFS $2 OFS $4 OFS $5 OFS $6 } { print }' file
18DMA H 0.886 5.687 5.320
18DMA H 1.019 5.764 5.247
18DMA Np 0.947 5.584 5.151
18DMA H 1.033 5.541 5.113
18DMA Cn 0.880 5.674 5.050
18DMA H 0.831 5.616 4.971
18DMA H 0.814 5.751 5.091
18DMA H 0.957 5.735 5.003
18DMA Cn 0.837 5.486 5.185
This short awk program will do different things to the input line depending on whether it contains 5 or 6 whitespace-delimited fields.
If it contains five fields, it removes all digits from the end of the second field and leaves the rest as it is. If it contains six fields, it rewrites the line but omits the third field.
The output will be tab-delimited (or delimited by whatever you set OFS to on the command line).
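Since tabs are hard to see on a rendered page, one possible way to confirm the delimiters is to split the result on tabs again and count the fields. This is only a sketch on two made-up input lines; the variable names sample, result and counts are hypothetical:

```shell
# Two hypothetical input lines: one with a separate number, one with it joined.
sample=$(printf '%s\n' \
    '18DMA H 9999 1.033 5.541 5.113' \
    '18DMA Cn10000 0.880 5.674 5.050')

# Same program as above, spread over several lines for readability.
result=$(printf '%s\n' "$sample" | awk -v OFS='\t' '
    NF == 5 { sub("[0-9]*$", "", $2) }                # joined: strip trailing digits
    NF == 6 { $0 = $1 OFS $2 OFS $4 OFS $5 OFS $6 }   # separate: skip field 3
    { print }')

# Re-read the result with tab as the only separator and count the fields.
counts=$(printf '%s\n' "$result" | awk -F'\t' '{ print NF }')
printf '%s\n' "$counts"
```

Each line should report five fields. Note that stripping trailing digits relies on atom names that do not themselves end in a digit.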
edited Dec 23 '18 at 13:15
answered Dec 23 '18 at 13:09 by Kusalananda
What about using vim?
vim +"%s/\([A-Za-z]\)\@<=\s\?\d\+//g" +"w file1" +"q\!" file
The regex in this vim command finds the exact pattern and deletes every match; the result is then saved as file1 and vim quits. Your desired formatted output is now in file1.
See, vim is ultimately a poor man's sed, awk, perl -e 's/.../', tr, cut and many more altogether.
NB: This will also work with vi. The slash before the bang ( ! ) escapes the bang. The regex is vim-flavored.
edited Dec 24 '18 at 8:34
answered Dec 24 '18 at 8:28 by Ritajit Kundu
If I were you, I would first "fix" the original, and then simply delete the column. You can do both in a single pass, though:
awk '{sub(/[0-9]+/," &",$2); $0=$0; $3=""; print}' input_file
18DMA H 0.886 5.687 5.320
18DMA H 1.019 5.764 5.247
18DMA Np 0.947 5.584 5.151
18DMA H 1.033 5.541 5.113
18DMA Cn 0.880 5.674 5.050
18DMA H 0.831 5.616 4.971
18DMA H 0.814 5.751 5.091
18DMA H 0.957 5.735 5.003
18DMA Cn 0.837 5.486 5.185
The $0=$0 assignment will cause awk to recompute (and re-split) the current line. Unlike all other answers, this only makes assumptions about the possible format of the 2nd field, not about the length or the number of the fields.
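To see what the re-split does, here is a small sketch that prints the field count with and without the $0=$0 assignment (the sample line is taken from the question; the variable names line, before and after are made up):

```shell
# One line where the atom name and number are joined (5 fields as read).
line='18DMA Cn10000 0.880 5.674 5.050'

# Changing $2 rebuilds $0, but awk does not split it again: NF is still 5.
before=$(printf '%s\n' "$line" | awk '{ sub(/[0-9]+/, " &", $2); print NF }')

# Assigning $0 to itself forces a re-split, so "Cn 10000" becomes two fields.
after=$(printf '%s\n' "$line" | awk '{ sub(/[0-9]+/, " &", $2); $0 = $0; print NF }')

printf 'before=%s after=%s\n' "$before" "$after"   # prints: before=5 after=6
```

Without the re-split, $3 would still refer to the first coordinate, and emptying it would delete the wrong data.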
A version that will use Tab as the output field separator:
awk -vOFS='\t' '{sub(/[0-9]+/," &",$2); $0=$0; $3=""; sub(OFS OFS,OFS); print}' input_file
18DMA H 0.886 5.687 5.320
18DMA H 1.019 5.764 5.247
18DMA Np 0.947 5.584 5.151
18DMA H 1.033 5.541 5.113
18DMA Cn 0.880 5.674 5.050
18DMA H 0.831 5.616 4.971
18DMA H 0.814 5.751 5.091
18DMA H 0.957 5.735 5.003
18DMA Cn 0.837 5.486 5.185
The extra sub(OFS OFS, OFS) will collapse the empty field created by $3="". That should only be necessary if the file is to be processed by a tool which is specifically expecting tab-delimited fields, or for aesthetic reasons.
edited Dec 24 '18 at 7:29
answered Dec 24 '18 at 6:45 by Uncle Billy
The column is not deleted using this approach, though, just emptied. If tabs were used as the field delimiter, you would still have an empty column where the column 3 data used to be.
– Kusalananda
Dec 24 '18 at 6:51
@Kusalananda added tab delimited version
– Uncle Billy
Dec 24 '18 at 7:08
For the record: the fields from the output of the 2nd version are separated by tabs, and there isn't any tab/empty column/tab or trailing tabs. It's the R-word web interface which is messing up whitespaces -- why? (fwiw, it's perfectly possible to preserve tabs in html by using a <pre>); how are people able to post makefiles and diffs here?
– Uncle Billy
Dec 24 '18 at 8:30
Code posted here is for illustration. One is supposed to read and understand the code (Makefile or otherwise) and to know where there are tabs and where there are spaces. If it's unclear, it's helpful to point this out. Code posted here is not meant for thoughtless copying and pasting (that goes for the rest of the web too, obviously).
– Kusalananda
Dec 24 '18 at 9:44
So not being able to exchange accurate diff(1)s is a feature now? LOL.
– Uncle Billy
Dec 24 '18 at 10:45