regex - How to extract characters before a pattern -


I need help extracting a specific string from each line of a file.

I have a file with thousands of lines like this:

eukaryota; alveolata; ciliophora; intramacronucleata; paramecium#
eukaryota; viridiplantae; streptophyta; embryophyta#
bacteria; cyanobacteria; synechococcales; acaryochloridaceae; acaryochloris#
eukaryota; viridiplantae#
bacteria; proteobacteria; alphaproteobacteria#

and I want to obtain the first and last item of each line. The output should be:

eukaryota; paramecium#
eukaryota; embryophyta#
bacteria; acaryochloris#
eukaryota; viridiplantae#
bacteria; alphaproteobacteria#

I know how to get the 1st column:

awk '{print$1}' filein > fileout 

but I don't know how to get the last item, since it sits in a different column on each line.

I tried adding a # to the end of each line and keeping the xx characters before the #:

grep -E -o '.{x,x}pattern' filein > fileout

where the output looks like:

les; sulfolobaceae; sulfolobus#
; thermoproteaceae; caldivirga#
les; haloferacaceae; haloferax#
haloferacaceae; haloquadratum#
ales; natrialbaceae; natrialba#

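but then I have to repeat the procedure and remove everything up to each ; until I'm left with only the final item.

I've searched to see if there is a grep or awk option for this, either to extract the 1st and last columns or to extract the characters attached to the #, but I could not find anything that works for me.

I would appreciate any suggestions on how to proceed.

Thanks.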

$ awk 'BEGIN{FS=OFS=";"}{print $1,$NF}' file
eukaryota; paramecium#
eukaryota; embryophyta#
bacteria; acaryochloris#
eukaryota; viridiplantae#
bacteria; alphaproteobacteria#
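To see why this works: awk splits each line on the field separator FS, $1 is the first field, and NF holds the number of fields on the current line, so $NF is always the last field, whatever column it falls in. Below is a minimal sketch of the same idea wrapped in a shell function. The name first_last and the single-item line archaea# are made up for illustration and are not from the question. It assumes the items are always separated by "; " (semicolon plus one space) and that a line with only one item should pass through unchanged rather than be printed twice.

```shell
# first_last: print the first and last "; "-separated item of each line.
# The function name is hypothetical; it reads the named files, or stdin if none.
first_last() {
    # FS = "; " splits on semicolon+space, so no field keeps a leading blank.
    # NF is the field count, so $NF is the last field on every line.
    # Lines with a single field are printed as-is instead of "x; x".
    awk 'BEGIN { FS = OFS = "; " } { print (NF > 1 ? $1 OFS $NF : $0) }' "$@"
}

# Two lines from the question plus one made-up single-item line.
printf '%s\n' \
    'eukaryota; alveolata; ciliophora; intramacronucleata; paramecium#' \
    'eukaryota; viridiplantae#' \
    'archaea#' | first_last
```

On the real data this would be called as first_last filein > fileout. If some lines use a bare ";" with no space after it, splitting on ";" as in the one-liner above is the safer choice.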
