regex - How to extract characters before a pattern

February 15, 2011

I need help extracting a specific string from each line of a file.

I have a file with thousands of lines like this:

eukaryota; alveolata; ciliophora; intramacronucleata; paramecium#
eukaryota; viridiplantae; streptophyta; embryophyta#
bacteria; cyanobacteria; synechococcales; acaryochloridaceae; acaryochloris#
eukaryota; viridiplantae#
bacteria; proteobacteria; alphaproteobacteria#

and I want to obtain the first and last item of each line. The output should be:

eukaryota; paramecium#
eukaryota; embryophyta#
bacteria; acaryochloris#
eukaryota; viridiplantae#
bacteria; alphaproteobacteria#

I know how to get the first column:

awk '{print$1}' filein > fileout

but I don't know how to get the last item, because it sits in a different column on each line.

I tried appending a # to each line and keeping x characters before the #:

grep -E -o '.{x,x}pattern' filein > fileout

where the output looks like:

les; sulfolobaceae; sulfolobus#
; thermoproteaceae; caldivirga#
les; haloferacaceae; haloferax#
haloferacaceae; haloquadratum#
ales; natrialbaceae; natrialba#

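but then I have to repeat the procedure and remove everything up to each ; until I'm left with only the final item.

I've searched for a grep or awk option that would do this, either extracting the first and last column or extracting only the characters attached to the #, but I did not find anything that worked for me.

I would appreciate any suggestions on how to proceed.

Thanks.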

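Set both the input and output field separators to ";" and print the first field and the last field ($NF). Note that awk's built-in names are case-sensitive, so BEGIN, FS, OFS and NF must be uppercase:

$ awk 'BEGIN{FS=OFS=";"}{print $1,$NF}' file
eukaryota; paramecium#
eukaryota; embryophyta#
bacteria; acaryochloris#
eukaryota; viridiplantae#
bacteria; alphaproteobacteria#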
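One caveat with printing $1 and $NF: if a line has only a single field (no ; at all), awk prints that field twice. If such lines can occur, a sed substitution is one possible alternative. The following is only a sketch and not part of the original answer; the helper name first_last is made up for illustration, and it assumes fields are separated by ";" exactly as in the sample data.

```shell
#!/bin/sh
# first_last: hypothetical helper (name chosen for illustration only).
# Reads lines on stdin and keeps the first and last ';'-separated fields.
first_last() {
    # \1 = first field (everything before the first ';')
    # .* = greedily swallows the middle fields up to the last ';'
    # \2 = last field (everything after the last ';')
    # Lines with fewer than three fields do not match and pass through unchanged.
    sed 's/^\([^;]*\);.*;\([^;]*\)$/\1;\2/'
}

# Example run on two of the sample lines from the question.
printf '%s\n' \
    'eukaryota; alveolata; ciliophora; intramacronucleata; paramecium#' \
    'eukaryota; viridiplantae#' | first_last
```

To process a whole file, the same function can be fed by redirection, for example first_last < filein > fileout.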
