copy certain spaces from a file












Score: 3

I have a file which looks like this:

   18DMA      H 9996   0.886   5.687   5.320
   18DMA      H 9997   1.019   5.764   5.247
   18DMA     Np 9998   0.947   5.584   5.151
   18DMA      H 9999   1.033   5.541   5.113
   18DMA     Cn10000   0.880   5.674   5.050
   18DMA      H10001   0.831   5.616   4.971
   18DMA      H10002   0.814   5.751   5.091
   18DMA      H10003   0.957   5.735   5.003
   18DMA     Cn10004   0.837   5.486   5.185

The desired output has column 3 deleted. However, from a certain line onward there is no space between the atom name and the atom number, so I cannot delete by column. Is there any way to delete by selecting a certain range of characters instead? The desired output should be:

   18DMA      H    0.886   5.687   5.320
   18DMA      H    1.019   5.764   5.247
   18DMA     Np    0.947   5.584   5.151
   18DMA      H    1.033   5.541   5.113
   18DMA     Cn    0.880   5.674   5.050
   18DMA      H    0.831   5.616   4.971
   18DMA      H    0.814   5.751   5.091
   18DMA      H    0.957   5.735   5.003
   18DMA     Cn    0.837   5.486   5.185









  • Please clarify what you mean by atom name and number.

    – RomanPerekhrest
    Dec 23 '18 at 13:01

  • My problem is in lines such as 18DMA Cn10000 0.880 5.674 5.050: there is no space between Cn and 10000, so I cannot copy the desired column. Instead of copying a column, I somehow need to copy certain characters to make it work.

    – Dimitris Mintis
    Dec 23 '18 at 13:05
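
The effect described in the comment above can be seen by counting whitespace-delimited fields. This is a minimal sketch, assuming the fixed-width layout of the first sample line holds for every line; the two-line sample and the function name count_fields are illustrative and not part of the question.

```shell
# Two lines in the assumed fixed-width layout: one with a separate number
# field, one where the name and number columns touch.
sample='   18DMA      H 9999   1.033   5.541   5.113
   18DMA     Cn10000   0.880   5.674   5.050'

# Print how many whitespace-delimited fields awk sees on each line.
count_fields() {
    awk '{ print NF }'
}

printf '%s\n' "$sample" | count_fields
```

The first line splits into six fields and the second into five, which is why "column 3" stops meaning the same thing once the number reaches five digits.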
















text-processing






asked Dec 23 '18 at 12:56 by Dimitris Mintis
edited Dec 23 '18 at 13:11 by Kusalananda

5 Answers
Score: 8

Use cut in character mode:

    cut -c1-15,21- file

You may need to tweak the exact character numbers.
This assumes the input doesn't use TABs (\t characters) as delimiters (which it probably doesn't, since then you wouldn't have the problem of the joined fields in the first place).

If there are tabs, then the expand program can convert them to spaces.
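
To make the character ranges concrete, here is a small sketch that runs the same cut invocation on two sample lines. It assumes the layout seen in the question's first line (15 characters up to the end of the atom name, then a 5-character number); the sample variable and the function name drop_number_field are illustrative.

```shell
# Two lines in the assumed fixed-width layout.
sample='   18DMA      H 9999   1.033   5.541   5.113
   18DMA     Cn10000   0.880   5.674   5.050'

# Keep characters 1-15 and 21 onward, which drops characters 16-20.
drop_number_field() {
    cut -c1-15,21-
}

printf '%s\n' "$sample" | drop_number_field
```

Because cut counts characters rather than fields, both lines lose the number whether or not it touches the atom name.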






Score: 5

Assuming you don't have <TAB>s but multiple spaces as field separators, and by looking at and counting your sample data, I came up with

    $ sed -E 's/^(.{15}).{5}/\1/' file
       18DMA      H   0.886   5.687   5.320
       18DMA      H   1.019   5.764   5.247
       18DMA     Np   0.947   5.584   5.151
       18DMA      H   1.033   5.541   5.113
       18DMA     Cn   0.880   5.674   5.050
       18DMA      H   0.831   5.616   4.971
       18DMA      H   0.814   5.751   5.091
       18DMA      H   0.957   5.735   5.003
       18DMA     Cn   0.837   5.486   5.185

It captures the first 15 characters in a group and restores them with the "back reference" \1 in the replacement part of the substitute command, so only the following 5 characters are dropped.
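
As a sketch of the same substitution with the back reference written out as \1, the block below runs it in both extended and basic regular expression syntax. It assumes the 15 + 5 character layout from the question's first sample line; the sample variable and the function names drop_ere and drop_bre are illustrative.

```shell
# Two lines in the assumed fixed-width layout.
sample='   18DMA      H 9999   1.033   5.541   5.113
   18DMA     Cn10000   0.880   5.674   5.050'

# Extended regex form; needs a sed that accepts -E.
drop_ere() {
    sed -E 's/^(.{15}).{5}/\1/'
}

# The same substitution in basic regex syntax, for a sed without -E.
drop_bre() {
    sed 's/^\(.\{15\}\).\{5\}/\1/'
}

printf '%s\n' "$sample" | drop_ere
printf '%s\n' "$sample" | drop_bre
```

Both forms should print the same two lines with the number removed.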






Score: 4

    $ awk -v OFS='\t' 'NF == 5 { sub("[0-9]*$", "", $2) } NF == 6 { $0 = $1 OFS $2 OFS $4 OFS $5 OFS $6 } { print }' file
    18DMA H 0.886 5.687 5.320
    18DMA H 1.019 5.764 5.247
    18DMA Np 0.947 5.584 5.151
    18DMA H 1.033 5.541 5.113
    18DMA Cn 0.880 5.674 5.050
    18DMA H 0.831 5.616 4.971
    18DMA H 0.814 5.751 5.091
    18DMA H 0.957 5.735 5.003
    18DMA Cn 0.837 5.486 5.185

This short awk program does different things to the input line depending on whether it contains 5 or 6 whitespace-delimited fields.

If it contains five fields, it removes all digits from the end of the second field and leaves the rest as it is. If it contains six fields, it rewrites the line but omits the third field.

The output will be tab-delimited (or delimited by whatever you set OFS to on the command line).
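
The two cases can be laid out one rule per line. This is a sketch of the same logic, assuming lines have either five or six whitespace-delimited fields as in the sample; the sample variable and the function name drop_by_field_count are illustrative.

```shell
# Two lines in the assumed layout: six fields, then five fields.
sample='   18DMA      H 9999   1.033   5.541   5.113
   18DMA     Cn10000   0.880   5.674   5.050'

drop_by_field_count() {
    awk -v OFS='\t' '
        # Five fields: name and number are fused, so strip trailing digits from field 2.
        NF == 5 { sub(/[0-9]*$/, "", $2) }
        # Six fields: rebuild the record without field 3.
        NF == 6 { $0 = $1 OFS $2 OFS $4 OFS $5 OFS $6 }
        { print }
    '
}

printf '%s\n' "$sample" | drop_by_field_count
```

One limit of this approach: on a fused line, an atom name that itself ends in a digit would lose that digit too, since every trailing digit of the second field is removed.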






Score: 1

What about using vim?

    vim '+%s/\([A-Za-z]\)\@<=\s\?\d\+//g' '+w file1' '+q!' file

The regex in this vim command matches a run of digits, optionally preceded by one whitespace character, that directly follows a letter, and deletes it. vim then saves the result as file1 and quits. The desired output is now in file1.

See, vim is ultimately a poor man's sed, awk, perl -e 's/.../', tr, cut and many more, all together.

NB: The regex is Vim-flavored, so this also works with vi only where vi is provided by Vim. The commands are single-quoted so that an interactive shell does not treat the bang ( ! ) as a history expansion.
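
For comparison, the pattern used in the Vim substitution (digits, optionally preceded by one whitespace character, directly after a letter) can be approximated in sed. This is a rough equivalent rather than the answer's own method: sed has no look-behind, so the sketch captures the letter and puts it back. The sample variable and the function name drop_digits_after_letter are illustrative.

```shell
# Two lines in the assumed fixed-width layout.
sample='   18DMA      H 9999   1.033   5.541   5.113
   18DMA     Cn10000   0.880   5.674   5.050'

# Match a letter, at most one whitespace character, then digits;
# keep only the letter.
drop_digits_after_letter() {
    sed -E 's/([A-Za-z])[[:space:]]?[0-9]+/\1/g'
}

printf '%s\n' "$sample" | drop_digits_after_letter
```

Like the Vim pattern, this relies on at most one space separating the atom name from the number, so it would miss numbers padded with two or more spaces.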






Score: 0

If I were you, I would first "fix" the original, and then simply delete the column. You can do both in a single pass, though:

    awk '{sub(/[0-9]+/," &",$2); $0=$0; $3=""; print}' input_file

    18DMA H 0.886 5.687 5.320
    18DMA H 1.019 5.764 5.247
    18DMA Np 0.947 5.584 5.151
    18DMA H 1.033 5.541 5.113
    18DMA Cn 0.880 5.674 5.050
    18DMA H 0.831 5.616 4.971
    18DMA H 0.814 5.751 5.091
    18DMA H 0.957 5.735 5.003
    18DMA Cn 0.837 5.486 5.185

The $0=$0 assignment causes awk to recompute (and re-split) the current line. Unlike all other answers, this only makes assumptions about the possible format of the 2nd field, not about the length or the number of the fields.

A version that uses Tab as the output field separator:

    awk -vOFS='\t' '{sub(/[0-9]+/," &",$2); $0=$0; $3=""; sub(OFS OFS,OFS); print}' input_file

    18DMA H 0.886 5.687 5.320
    18DMA H 1.019 5.764 5.247
    18DMA Np 0.947 5.584 5.151
    18DMA H 1.033 5.541 5.113
    18DMA Cn 0.880 5.674 5.050
    18DMA H 0.831 5.616 4.971
    18DMA H 0.814 5.751 5.091
    18DMA H 0.957 5.735 5.003
    18DMA Cn 0.837 5.486 5.185

The extra sub(OFS OFS, OFS) collapses the empty field created by $3="". That should only be necessary if the file is to be processed by a tool which specifically expects tab-delimited fields, or for aesthetic reasons.
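
The tab-delimited variant can be read step by step when written one statement per line. This is a sketch of the same sequence, assuming the second field is an atom name that may have the number fused onto it; the sample variable and the function name split_then_drop are illustrative.

```shell
# Two lines in the assumed layout: separate number, then fused number.
sample='   18DMA      H 9999   1.033   5.541   5.113
   18DMA     Cn10000   0.880   5.674   5.050'

split_then_drop() {
    awk -v OFS='\t' '{
        sub(/[0-9]+/, " &", $2)   # put a space in front of any digits in field 2
        $0 = $0                   # re-split the record into fields
        $3 = ""                   # empty the number field
        sub(OFS OFS, OFS)         # collapse the empty field left behind
        print
    }'
}

printf '%s\n' "$sample" | split_then_drop
```

Both sample lines should come out with five tab-separated fields and no empty column.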






  • The column is not deleted using this approach, though, just emptied. If tabs were used as the field delimiter, you would still have an empty column where the column 3 data used to be.

    – Kusalananda
    Dec 24 '18 at 6:51

  • @Kusalananda added a tab-delimited version

    – Uncle Billy
    Dec 24 '18 at 7:08

  • For the record: the fields in the output of the 2nd version are separated by tabs, and there isn't any tab/empty column/tab or trailing tab. It's the R-word web interface which is messing up whitespace -- why? (fwiw, it's perfectly possible to preserve tabs in HTML by using a <pre>); how are people able to post makefiles and diffs here?

    – Uncle Billy
    Dec 24 '18 at 8:30

  • Code posted here is for illustration. One is supposed to read and understand the code (Makefile or otherwise) and to know where there are tabs and where there are spaces. If it's unclear, it's helpful to point this out. Code posted here is not meant for thoughtless copying and pasting (that goes for the rest of the web too, obviously).

    – Kusalananda
    Dec 24 '18 at 9:44

  • So not being able to exchange accurate diff(1)s is a feature now? LOL.

    – Uncle Billy
    Dec 24 '18 at 10:45
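
Since the rendered page cannot be trusted to show tabs, one way to check what a command really emits is to count the tab characters in its output. This is a minimal sketch; the function name count_tabs and the sample records are illustrative.

```shell
# Count the tab characters in whatever arrives on standard input.
count_tabs() {
    # tr -cd keeps only tabs; the arithmetic expansion strips any padding wc adds.
    echo $(( $(tr -cd '\t' | wc -c) ))
}

# One tab-delimited record with five fields, so four tabs are expected.
printf '18DMA\tCn\t0.880\t5.674\t5.050\n' | count_tabs

# The same record with spaces contains no tabs.
printf '18DMA Cn 0.880 5.674 5.050\n' | count_tabs
```

Piping the output through od -c is another option, as it displays each tab as \t.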











Answer using cut: answered Dec 23 '18 at 13:10 by peterph; edited Dec 24 '18 at 11:16 by Volker Siegel

Answer using sed: answered Dec 23 '18 at 13:07 by RudiC

Answer using awk field counts: answered Dec 23 '18 at 13:09 by Kusalananda; edited Dec 23 '18 at 13:15

Answer using vim: answered Dec 24 '18 at 8:28 by Ritajit Kundu; edited Dec 24 '18 at 8:34

                                          0














                                          If I were you, I would first "fix" the original, and then simply delete the column. You can do both in a single pass, though:



                                          awk '{sub(/[0-9]+/," &",$2); $0=$0; $3=""; print}' input_file

                                          18DMA H 0.886 5.687 5.320
                                          18DMA H 1.019 5.764 5.247
                                          18DMA Np 0.947 5.584 5.151
                                          18DMA H 1.033 5.541 5.113
                                          18DMA Cn 0.880 5.674 5.050
                                          18DMA H 0.831 5.616 4.971
                                          18DMA H 0.814 5.751 5.091
                                          18DMA H 0.957 5.735 5.003
                                          18DMA Cn 0.837 5.486 5.185


                                          The $0=$0 assignment will cause awk to recompute (and re-split) the current line. Unlike all other answers, this only make assumptions about the possible format of the 2nd field, not about the length or the number of the fields.



                                          A version that will use Tab as the output field separator:



                                          awk -vOFS='t' '{sub(/[0-9]+/," &",$2); $0=$0; $3=""; sub(OFS OFS,OFS); print}' input_file

                                          18DMA H 0.886 5.687 5.320
                                          18DMA H 1.019 5.764 5.247
                                          18DMA Np 0.947 5.584 5.151
                                          18DMA H 1.033 5.541 5.113
                                          18DMA Cn 0.880 5.674 5.050
                                          18DMA H 0.831 5.616 4.971
                                          18DMA H 0.814 5.751 5.091
                                          18DMA H 0.957 5.735 5.003
                                          18DMA Cn 0.837 5.486 5.185


                                          The extra sub(OFS OFS, OFS) collapses the empty field created by $3="". That should only be necessary if the file is to be processed by a tool that specifically expects tab-delimited fields, or for aesthetic reasons.
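Since tabs are easy to lose when output is copied or rendered, one way to check the result is to make them visible. The sketch below is only an illustration: it runs the tab version on one sample line from the question and uses tr to show each tab as a | character (the variable name result is an assumption, not part of the answer).

```shell
# Run the tab-separated version on one sample line and make the tabs visible.
result=$(printf '18DMA Cn10000 0.880 5.674 5.050\n' |
  awk -v OFS='\t' '{sub(/[0-9]+/," &",$2); $0=$0; $3=""; sub(OFS OFS,OFS); print}' |
  tr '\t' '|')   # show every tab as a pipe character
echo "$result"
```

This should print 18DMA|Cn|0.880|5.674|5.050, with no doubled or trailing separator. Dropping the sub(OFS OFS,OFS) call would leave || after Cn, which is the empty column mentioned in the comments.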
































                                          • Note that the column is not deleted with this approach, just emptied. If tabs were used as the field delimiter, you would still have an empty column where the column 3 data used to be.

                                            – Kusalananda
                                            Dec 24 '18 at 6:51











                                          • @Kusalananda Added a tab-delimited version.

                                            – Uncle Billy
                                            Dec 24 '18 at 7:08











                                          • For the record: the fields in the output of the 2nd version are separated by tabs, and there isn't any tab/empty column/tab sequence or trailing tab. It's the R-word web interface which is messing up whitespace -- why? (fwiw, it's perfectly possible to preserve tabs in html by using a <pre>); how are people able to post makefiles and diffs here?

                                            – Uncle Billy
                                            Dec 24 '18 at 8:30













                                          • Code posted here is for illustration. One is supposed to read and understand the code (Makefile or otherwise) and to know where there are tabs and where there are spaces. If it's unclear, it's helpful to point this out. Code posted here is not meant for thoughtless copying and pasting (that goes for the rest of the web too, obviously).

                                            – Kusalananda
                                            Dec 24 '18 at 9:44











                                          • So not being able to exchange accurate diff(1)s is a feature now? LOL.

                                            – Uncle Billy
                                            Dec 24 '18 at 10:45
























                                          edited Dec 24 '18 at 7:29

























                                          answered Dec 24 '18 at 6:45









                                          Uncle Billy

                                          4205






























