www.elsevier.com/locate/gene
Phylogenetic relationships of the Fox (Forkhead) gene family in the Bilateria
Françoise Mazet a, Jr-Kai Yu b, David A. Liberles c, Linda Z. Holland b, Sebastian M. Shimeld a,*
a School of Animal and Microbial Sciences, The University of Reading, P.O. Box 228 Whiteknights, Reading RG6 6AJ, UK
b Scripps Institution of Oceanography, University of California at San Diego, La Jolla, CA, USA
c Computational Biology Unit, Centre for Computational Science, University of Bergen, 5020 Bergen, Norway
Received 15 April 2003; received in revised form 28 May 2003; accepted 29 May 2003
Received by R. Di Lauro
Abstract
The Forkhead or Fox gene family encodes putative transcription factors. There are at least four Fox genes in yeast, 16 in Drosophila melanogaster (Dm) and 42 in humans. Recently, vertebrate Fox genes have been classified into 17 groups named FoxA to FoxQ [Genes Dev. 14 (2000) 142]. Here, we extend this analysis to invertebrates, using available sequences from D. melanogaster, Anopheles gambiae (Ag), Caenorhabditis elegans (Ce), the sea squirt Ciona intestinalis (Ci) and amphioxus Branchiostoma floridae (Bf), from which we also cloned several Fox genes. Phylogenetic analyses lend support to the previous overall subclassification of vertebrate genes, but suggest that four subclasses (FoxJ, L, N and Q) could be further subdivided to reflect their relationships to invertebrate genes. We were unable to identify orthologs of Fox subclasses E, H, I, J, M and Q1 in D. melanogaster, A. gambiae or C. elegans, suggesting either considerable loss in ecdysozoans or the evolution of these subclasses in the deuterostome lineage. Our analyses suggest that the common ancestor of protostomes and deuterostomes had a minimum complement of 14 Fox genes. © 2003 Elsevier B.V. All rights reserved.
Keywords: Phylogeny; Evolution; Amphioxus; Ciona
Abbreviations: Ag, Anopheles gambiae; Bf, Branchiostoma floridae; BLAST, Basic Local Alignment Search Tool; Ce, Caenorhabditis elegans; Ci, Ciona intestinalis; Dm, Drosophila melanogaster; Dr, Danio rerio; FKHRP1, Fork Head Related Pseudogene 1; Fr, Fugu rubripes; FREAC, Fork Head Related Activator; Hs, Homo sapiens; slp, sloppy paired; SMART, Simple Modular Architecture Research Tool; Xl, Xenopus laevis.
* Corresponding author. Tel.: +44-118-9875123x7084; fax: +44-118-9310180.
E-mail address: s.m.shimeld@reading.ac.uk (S.M. Shimeld).
0378-1119/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/S0378-1119(03)00741-8
F. Mazet et al. / Gene 316 (2003) 79–89
1. Introduction
The Fox gene family encodes transcription factors containing an approximately 100 amino acid DNA-binding domain known as the forkhead domain (Weigal and Jäckle, 1990; reviewed by Granadino et al., 2000). This domain forms a structure known as a winged helix, suggesting the Fox genes are related to the large assemblage of genes containing winged helix domains that occur in both eukaryote and prokaryote genomes (Clark et al., 1993). Amongst this assemblage, however, the Fox genes form a discrete group recognisable on the basis of the high degree of amino acid sequence identity of their forkhead domains (Gajiwala and Burley, 2000). Fox genes have been identified in many animals as well as in yeast and other fungi, but not in plants. This suggests an evolutionary origin of the Fox gene family in a clade of unicellular organisms that gave rise to both the fungal and animal lineages (Baldauf, 1999). In fungi, four Fox genes have been identified, while comparison of mammalian genomes shows at least 34 distinct orthologs, suggesting a major expansion in the complexity of this family during animal evolution.
An early phylogenetic analysis showed that the Fox genes can be subgrouped in several subclasses (Kaufmann and Knochel, 1996). Subsequently, an analysis including over 130 vertebrate genes and 3 invertebrate genes extended this classification, defining 15 subclasses of Fox genes named from A to O (Kaestner et al., 2000). Each subclass was well supported by bootstrap analysis, but most of the deeper relationships between subclasses were unclear. More recently, two more subclasses, P and Q, were defined for a total of 17 (Hong et al., 2001; Shu et al., 2001; Schubert et al., 2001).
Fox genes have also been found in several invertebrate taxa; however, a systematic approach to their classification is not yet available. Correspondingly, we have undertaken a molecular phylogenetic analysis of Metazoan Fox genes, including Fox sequences from the basal chordates Ciona intestinalis (Ci) (an ascidian) and Branchiostoma floridae (Bf) (amphioxus) as well as from vertebrates and from three protostome taxa with sequenced genomes (Drosophila melanogaster [Dm], Anopheles gambiae [Ag] and Caenorhabditis elegans [Ce]). Our results show that most invertebrate Fox genes can be ascribed as orthologous to specific clades of vertebrate genes, and that these clades typically correspond to the previously defined vertebrate subclasses. A few invertebrate Fox genes do not show a robust relationship with any specific clade of vertebrate genes, and some clades of chordate genes do not have protostome orthologs. Overall, we identify 14 clades of Fox genes with members in both chordates and ecdysozoans, setting this as the minimum Fox gene complement of the common ancestor of protostomes and deuterostomes. We also examine the evidence for the role of selection in Fox subclass diversification and identify candidate sites underlying the adaptive radiation of Fox gene subclasses.
2. Materials and methods
2.1. Acquisition of sequences
Sequences were acquired from the following databases: D. melanogaster: http://flybase.bio.indiana.edu/ (17 sequences); C. intestinalis: http://ghost.zool.kyoto-u.ac.jp/ (for cDNA sequences) and http://www.jgi.doe.gov/ (for genomic sequences) (11 sequences in total). We note, though, that our list of C. intestinalis sequences is unlikely to be exhaustive, and that more Fox genes are likely to be present in this taxon as revealed by the ongoing genome assembly process (N. Satoh, personal communication). Fugu rubripes: http://www.jgi.doe.gov/ and http://www.hgmp.mrc.ac.uk (release 1) (24 sequences). C. elegans (14 sequences), Danio rerio (Dr), Xenopus laevis (Xl) (16 sequences), Homo sapiens (Hs) (36 sequences) and yeast (4 sequences): http://www.ncbi.nlm.nih.gov/. A. gambiae: http://www.ensembl.org/Anopheles_gambiae/ (20 sequences). Initial searches of sequence annotation by keyword were supplemented with BLASTX and BLASTP searches to identify incorrectly annotated or unannotated sequences. We also consulted the Simple Modular Architecture Research Tool (SMART) database (Schultz et al., 2000) to ensure the methods used here for protein domain identification had not identified additional sequences that our searches had missed. Comparison of F. rubripes and D. rerio genes (not shown) revealed that most D. rerio genes had apparent orthologs in F. rubripes. Similarly, all the murine Fox genes we found were orthologous to H. sapiens genes. Therefore we omitted D. rerio and mouse Fox genes from our analysis, as they would not add further insight into relationships between invertebrate and vertebrate genes. The exception to this is the D. rerio gene DrFoxH, which is included in the analyses, as we could not find a F. rubripes (Fr) FoxH ortholog. Several amphioxus Fox genes have been previously identified (Shimeld, 1997a; Toresson et al., 1998; Schlake et al., 2000; Yasui et al., 2001; Mazet and Shimeld, 2002; Yu et al., 2002a,b, 2003). To these sequences, we added several that have been recently generated in our respective laboratories, giving 11 in total. Full characterisation of these additional genes will be published elsewhere.
2.2. Phylogenetic analyses
Fox genes from all taxa were aligned with CLUSTALX (Thompson et al., 1997) to identify the forkhead domain and any additional conserved domains which might be present. The forkhead domain proved to be the only consistently alignable section of the sequences in the dataset; hence, these were collated and realigned for phylogenetic analysis. Besides the forkhead domain, some Fox proteins also have another conserved peptide motif similar to the eh-1 domain encoded by several homeobox genes (Smith and Jaynes, 1996; Shimeld, 1997b). However, the relative position of this eh-1 domain to the forkhead domain is variable and the consensus weak (data not shown). Without functional analysis, it is difficult to determine if this represents divergence of a primitive functional domain, or convergent evolution of similar sequence with or without a functional role. Consequently, we do not believe that this eh-1-like domain is a reliable phylogenetic character, and we have not used it in our analyses.
Preliminary phylogenetic analysis of large datasets was conducted using Neighbor Joining implemented by CLUSTALX. Incomplete sequences, probable pseudogenes (e.g., Fork Head Related Pseudogene 1 [FKHRP1] from H. sapiens) and highly divergent sequences with an incomplete forkhead domain (e.g., crg-1 from D. melanogaster) were not included in the phylogeny, since preliminary sequence comparisons suggested that these resulted from relatively recent lineage-specific evolutionary events and are therefore not relevant to deeper phylogenetic relationships. Pairwise Ka/Ks ratios were calculated using both the NG (Nei and Gojobori, 1986) and the YN (Yang and Nielsen, 2000) methods. Covarion-based relative rate tests were performed on protein sequences to detect changes in the selective pressures acting on these genes (Gu, 1999; Gu and Vander Velden, 2002). Sites were identified for subclass-specific evolution if they showed posterior probabilities greater than 0.5 for more than half of the significant divergences for the respective class. Trees were viewed and manipulated with TreeViewPPC (Page, 1996).
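The site-selection rule stated above (posterior probability greater than 0.5 in more than half of the significant comparisons for a subclass) can be illustrated with a minimal sketch. This is not the DIVERGE implementation; the function name, the input layout (one list of per-site posterior probabilities per significant comparison) and the toy values are all assumptions made for illustration.

```python
from typing import Dict, List


def subclass_specific_sites(
    posteriors: Dict[str, List[float]],
    threshold: float = 0.5,
) -> List[int]:
    """Return 1-based alignment positions that pass the majority rule.

    `posteriors` maps each significant comparison involving one subclass
    (hypothetical labels such as "E vs A") to per-site posterior
    probabilities. All lists are assumed to have the same length.
    """
    comparisons = list(posteriors.values())
    if not comparisons:
        return []
    n_sites = len(comparisons[0])
    selected = []
    for site in range(n_sites):
        # Count comparisons in which this site exceeds the threshold.
        hits = sum(1 for probs in comparisons if probs[site] > threshold)
        # Keep the site only if it passes in strictly more than half.
        if hits > len(comparisons) / 2:
            selected.append(site + 1)
    return selected


# Hypothetical toy data: three comparisons, four alignment positions.
toy = {
    "E vs A": [0.9, 0.2, 0.6, 0.1],
    "E vs B": [0.8, 0.7, 0.4, 0.1],
    "E vs C": [0.7, 0.1, 0.6, 0.3],
}
print(subclass_specific_sites(toy))
```

Under this reading of the rule, a site that exceeds the threshold in every comparison is a special case of the majority rule, so tightening the final comparison to `hits == len(comparisons)` would give the stricter set.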
3. Results and discussion
3.1. Molecular phylogenetic analysis of the Fox genes
To examine the relationships between invertebrate Fox genes and the vertebrate Fox gene subclasses, we undertook a molecular phylogenetic analysis of the Fox family. Initially, we included all the available Fox sequences from C. intestinalis, B. floridae, C. elegans, D. melanogaster, A. gambiae, H. sapiens, F. rubripes and X. laevis, except for a small number of highly derived genes (see Materials and methods). We also included the four Fox genes from yeast, which form two pairs of related genes. We found no evidence that either pair is more closely related to one subset of animal Fox genes than any other (not shown). Consequently, the yeast genes are likely to be equally related to all of the animal genes. In further analyses, we excluded the fungal Fox genes, due to their high level of divergence from the animal Fox genes.
To examine the relationship of basal chordate Fox genes with putative vertebrate orthologs, we analysed vertebrate, C. intestinalis and B. floridae sequences. We also included D. melanogaster sequences, to give a preliminary indication of whether clades of chordate genes could be extended to include ecdysozoan orthologs. The results of this analysis are shown in Fig. 1. This analysis, like previous ones (Kaestner et al., 2000), showed that most vertebrate genes fell into well-defined clades supported by high bootstrap values. This included all F. rubripes, D. rerio and X. laevis genes, which were all closely related to at least one H. sapiens gene (Fig. 1 and data not shown). This suggests additional Fox subclasses do not remain to be discovered in humans. Most Fox genes of both basal chordates and D. melanogaster were associated with specific vertebrate clades with reasonable bootstrap support (discussed in detail in Sections 3.3–3.13).
To examine the relationships of all protostome Fox genes with chordate Fox genes, we constructed a stripped dataset of human genes, including one human member of each of the clades identified in Fig. 1. Where multiple human genes were present in a clade, the gene used was selected on the basis of its subclass number (that is, we used the gene named as 1, HsFoxD1, HsFoxE1, etc.) as the phylogeny indicated all members of the clade were equally valid choices for such an analysis. We also included all 'orphan' chordate genes which could not be ascribed to a specific clade. We also included all C. elegans, A. gambiae and D. melanogaster Fox genes (with the exception that we included only one of the gene pairs slp1/slp2 and fd4/fd5, as these have been previously shown to result from relatively recent tandem duplications). This dataset allows relationships between invertebrate and vertebrate genes to be clearly visualised. The results of the analysis are shown in Fig. 2.
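As a rough illustration of how one human representative per clade might be picked for such a stripped dataset, the sketch below chooses the gene with the lowest subclass number in each clade. The helper names, the `HsFox` naming pattern and the tie-breaking are assumptions for illustration; the text only states that the gene named as 1 was used.

```python
import re
from typing import Dict, List


def subclass_number(gene: str) -> int:
    """Extract the trailing subclass number from a name such as 'HsFoxD1'.

    Genes without a number are treated as 0 (an assumption of this sketch).
    """
    match = re.search(r"(\d+)$", gene)
    return int(match.group(1)) if match else 0


def pick_representatives(clades: Dict[str, List[str]]) -> Dict[str, str]:
    """For each clade, keep the human gene with the lowest subclass number."""
    return {clade: min(genes, key=subclass_number) for clade, genes in clades.items()}


# Hypothetical input: human genes grouped by the clades of Fig. 1.
clades = {
    "FoxD": ["HsFoxD3", "HsFoxD1", "HsFoxD2", "HsFoxD4"],
    "FoxE": ["HsFoxE2", "HsFoxE1", "HsFoxE3"],
    "FoxN2/3": ["HsFoxN3", "HsFoxN2"],
}
print(pick_representatives(clades))
```

For a clade with no gene numbered 1 (such as the hypothetical FoxN2/3 entry), the lowest-numbered gene is used here; that fallback is a choice of this sketch rather than something the text specifies.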
3.2. Classification of vertebrate and invertebrate Fox genes
Vertebrate genes have been previously categorised into subclasses on the basis of a molecular phylogenetic analysis. Essentially, each subclass was a monophyletic group of chordate genes separated from other Fox genes by high bootstrap values. Most of these subclasses contained multiple Fox genes from several different species, representing a mixture of orthologous and paralogous relationships. We examined our trees to see if ecdysozoan and basal chordate genes could be ascribed to individual vertebrate subclasses. If basal chordate genes can be ascribed as orthologous to a vertebrate subclass, we can conclude the subclass evolved as an independent clade of genes prior to the radiation of crown vertebrates and was primitively present in all vertebrates. If an ecdysozoan gene can be ascribed as orthologous to a vertebrate subclass, we can conclude the subclass evolved as an independent clade of genes prior to the radiation of crown bilaterians and was primitively present in all bilaterian animals.
For most vertebrate subclasses, such relationships were discernible. In some instances, multiple invertebrate genes were found to be orthologous to specific clades of vertebrate genes within a subclass. In these instances, we suggest subdividing the subclass to indicate these relationships. Finally, several vertebrate subclasses did not have identifiable orthologs in some or all of the taxa examined, and one robust clade of invertebrate genes did not have an ortholog in vertebrates. In total, we identified 14 clades of genes with representatives in chordates and ecdysozoans, defining this as the minimum complement of Fox genes in the common ancestor of the Bilateria. A detailed clade-by-clade description of our findings follows in Sections 3.3–3.13. The numbers in brackets represent percentage bootstrap support from Figs. 1 and 2, respectively. If only a single value is given, this is from Fig. 1, unless otherwise stated. These results are summarised in Fig. 3.
Fig. 1. Neighbor joining tree containing sequences from B. floridae (amphi), C. intestinalis (Ci or GCi), D. melanogaster (Dm), F. rubripes (Fr), H. sapiens (Hs), X. laevis (Xl). The single D. rerio sequence is indicated by the prefix Dr. Numbers represent percentage bootstrap support for the adjacent node. Only bootstrap support values greater than 60% are shown. Shaded boxes indicate individual subclasses and the bootstrap values supporting their definition. This was an unrooted analysis; however, for clarity, the depicted tree has been drawn as rooted with the FoxP subclass.
Fig. 2. Unrooted neighbor joining tree containing sequences from C. elegans (Ce), D. melanogaster (Dm), A. gambiae (Ag) and H. sapiens (Hs). Only one gene of each subclass has been selected for human genes. Subclasses with genes present in vertebrates plus at least one ecdysozoan are shaded, together with the associated bootstrap support value. Numbers represent percentage bootstrap support for the adjacent node, and values lower than 60% are not shown. H. sapiens, C. elegans and D. melanogaster genes are referred to by gene name; for a list of accession numbers, please see Table 1. A. gambiae genes are referred to by an abbreviated gene name. For a list of full gene reference numbers for these genes and details on how to access the sequences, please refer to Table 1.
Fig. 3. Fox gene distribution by subclass in the taxa analysed as part of this study. Filled boxes indicate that a member of that subclass has been detected in the indicated species. For simplicity, only human genes are shown as representative of vertebrates. Note the absence in all three ecdysozoan lineages of FoxE, H, I, J, M and Q1 orthologs.
3.3. FoxA to FoxG
The genes in these seven subclasses have been comparatively well studied, and most of the relationships we detected in phylogenetic analyses have been previously determined on the basis of direct sequence comparisons or smaller phylogenetic analyses. We noted monophyletic groupings of FoxA (99%/100%), FoxB (87%/97%), FoxC (100%/100%), FoxD (99%/100%), FoxF (99%/98%) and FoxG (90%/91%), each including at least one basal chordate and ecdysozoan ortholog. Vertebrate FoxE genes were also monophyletic in Fig. 2 (91%). No ecdysozoan FoxE orthologs were identified.
3.4. FoxH and FoxI
Putative chordate FoxH genes grouped with strong support (97%). However, no ecdysozoan FoxH candidates were identified. Similarly, chordate FoxI genes grouped strongly (100%), but ecdysozoan FoxI genes were not identified.
3.5. FoxJ
No putative ecdysozoan FoxJ genes were identified. Interestingly, our phylogenies suggested the FoxJ clade included two genes before the radiation of chordates, since our amphioxus sequence grouped with FOXJ1 (100%), while the C. intestinalis sequence grouped with FOXJ2 (99%). These two clades group consistently as sister clades in our phylogenies, but with limited support.
3.6. FoxK
Our phylogenetic analysis gave strong support to the subclassification of D. melanogaster LD13167 and A. gambiae Ag16550 as FoxK orthologs (100%/96%). Ag3502 is also likely to be a member of this subclass (82% in Fig. 2). However, we did not identify a FoxK gene in any other invertebrate taxon.
3.7. FoxL
Vertebrate FoxL2 groups with A. gambiae Ag59 (99% in Fig. 2) and the C. intestinalis locus Wno743d17 (100%). Amongst the chordates, FoxL1 orthologs were identified only in vertebrates (99%). The A. gambiae sequence Ag446 groups with FoxL1 in Fig. 2 with modest support. Examination of sequence alignments shows that Ag446 is orthologous to D. melanogaster fd2, as both genes share a unique derived indel in the forkhead domain (not shown). We therefore conclude Ag446 and fd2 are probably orthologous to FoxL1. Consequently, we suggest splitting the FoxL subclass to reflect the observation that both FoxL1 and FoxL2 have orthologs in ecdysozoans.
3.8. FoxM
Our analyses only identified a FOXM sequence in the human genome. However, we note FoxM genes have been isolated from other vertebrates (for example, mouse and rat, accession numbers O08696 and P97691, respectively, and the chicken EST 603956051F1 available from http://www.chick.umist.ac.uk/).
3.9. FoxN
The FoxN genes resolved into two clades, one with FOXN1 and FOXN4 related to A. gambiae Ag2273 and Ag2274, D. melanogaster jumu and amphioxus BfFoxN1 (100%/100%), and a second with FOXN2 and FOXN3 related to D. melanogaster ches-1 and A. gambiae Ag8820 (100%/100%). Consequently, we suggest splitting the FoxN subclass to reflect this separation. No C. elegans genes were associated with these clades. A single X. laevis and a single F. rubripes FoxN gene fell outside these clades (Fig. 1); however, without more sequences in this subclass, it is difficult to interpret this result. The two FoxN clades consistently appeared as sister clades in our phylogenies (87% in Fig. 2).
3.10. FoxO
The C. elegans gene daf-16 (which includes two separate forkhead domains; Ogg et al., 1997) and three very similar A. gambiae genes (Ag11033, Ag11036, Ag6490) and D. melanogaster Q95V55 (cg3143) grouped robustly with vertebrate FoxO genes (100%/99%).
3.11. FoxP
The D. melanogaster sequence cg16899, the A. gambiae sequence Ag14855 and the C. elegans sequence F26D12.1 grouped robustly with the vertebrate FoxP genes (100%/100%); however, we did not identify a FoxP ortholog in other invertebrates.
3.12. FoxQ
The prototype vertebrate FoxQ subclass gene was the human gene HFH11. In addition to this, we have identified definitive FoxQ genes in F. rubripes and C. intestinalis (100%). No ecdysozoan genes group robustly with this clade. However, a second clade of genes appears in our trees that includes D. melanogaster, A. gambiae and C. elegans genes (100% in Fig. 2) and C. intestinalis GCiWno740j23 (66%) and B. floridae AmphiFoxQ2
Table 1
Subclass membership for invertebrate and H. sapiens Fox genes
Subclass | Species | Current name(s) | Accession number or source of sequence a
FoxA | H. sapiens | FOXA1 (HNF-3α) | U39840
FoxA | H. sapiens | FOXA2 (HNF-3β) | NM_021784
FoxA | H. sapiens | FOXA3 (HNF-3γ) | L12141
FoxA | D. melanogaster | forkhead, fkh | J03177
FoxA | A. gambiae a | Ag19661 | ENSANGG00000019661
FoxA | C. elegans | Ce-fkh1 | U51163
FoxA | C. intestinalis | CiFoxA5 | AF002988
FoxA | B. floridae | AmHNF-3-1 | X96519
FoxA | B. floridae | AmHNF-3-2 | Y09236
FoxB | H. sapiens | FOXB1 | AF071554
FoxB | D. melanogaster | fd4 | P32028
FoxB | D. melanogaster | fd5 | P32029
FoxB | A. gambiae | Ag394 | ENSANGG00000020221
FoxB | A. gambiae | Ag2511 | ENSANGG00000002511
FoxB | C. elegans | lin-31 | L11148
FoxB | C. intestinalis | GciWno407-j18 | www.jgi.doe.gov
FoxB | B. floridae | AmphiFoxB | AJ506162
FoxC | H. sapiens | FOXC1 (MF1, FKHL7) | AF048693
FoxC | H. sapiens | FOXC2 (MFH1) | Y08223
FoxC | D. melanogaster | fd1 | P32027
FoxC | A. gambiae | Ag11032 | ENSANGG00000011032
FoxC | B. floridae | AmphiFoxC | FM and SMS, unpublished
FoxD | H. sapiens | FOXD1 (FREAC4) | U59831
FoxD | H. sapiens | FOXD2 (FREAC9) | AF042832
FoxD | H. sapiens | FOXD3 | L12142
FoxD | H. sapiens | FOXD4 (FREAC5) | U13223
FoxD | D. melanogaster | fd3 | Q02360
FoxD | A. gambiae | Ag8042 | ENSANGG00000008042
FoxD | C. elegans | unc-130 | NM_064010
FoxD | C. intestinalis | CiFoxD, GciWno586o10 | www.ghost.zool.kyoto-u.jp
FoxD | B. floridae | AmphiFoxD | AAN03853
FoxE | H. sapiens | FOXE1 (TITF2) | U89995
FoxE | H. sapiens | FOXE2 | X94553
FoxE | H. sapiens | FOXE3 | U42990
FoxE | C. intestinalis | GciWno830m18 | www.ghost.zool.kyoto-u.jp
FoxE | B. floridae | AmphiFoxE4 | AAK85731
FoxE | B. floridae | AmphiFoxE5 | FM and SMS, unpublished
FoxF | H. sapiens | FOXF1 | U13219
FoxF | H. sapiens | FOXF2 | U13220
FoxF | D. melanogaster | biniou, DMFoxF | AAK97051
FoxF | A. gambiae | Ag11090 | ENSANGG00000011090
FoxF | C. elegans | F26B1.7 | AAB37792
FoxF | C. intestinalis | CiFoxF | FM and SMS, unpublished
FoxF | B. floridae | AmphiFoxF | FM and SMS, unpublished
FoxG | H. sapiens | FOXG1 (BF1, HBF2) | X74143, X74142, X74144
FoxG | D. melanogaster | slp1 | P32030
FoxG | D. melanogaster | slp2 | P32031
FoxG | D. melanogaster | cg9571 | www.flybase.bio.indiana.edu
FoxG | C. elegans | CeT14G12.4 | AAA82436
FoxG | A. gambiae | Ag351 | ENSANGT00000024640
FoxG | B. floridae | AmphiBF-1 | AF067203
FoxH | H. sapiens | FOXH1 (FAST1) | AF076292
FoxH | C. intestinalis | GciWno152a23 | www.ghost.zool.kyoto-u.jp
FoxI | H. sapiens | FOXI1 (FREAC6, HFH3) | U13224
FoxI | C. intestinalis | GciWno743d17 | www.jgi.doe.gov
FoxJ1 | H. sapiens | FOXJ1 | U69537
FoxJ1 | B. floridae | AmphiFoxJ | JK-Y and LZH, unpublished
FoxJ2 | H. sapiens | FOXJ2 | AF155132
FoxJ2 | C. intestinalis | Cieg45o16 | www.ghost.zool.kyoto-u.jp
FoxK | H. sapiens | FOXK1a-c (ILF1-3) | U58196
FoxK | D. melanogaster | LD16137 | www.flybase.bio.indiana.edu
FoxK | A. gambiae | Ag3502 | ENSANGG00000003502
FoxK | A. gambiae | Ag16550 | ENSANGG00000016550
FoxL1 | H. sapiens | FOXL1 | U13225
FoxL1 | D. melanogaster | fd2 | Q02360
FoxL1 | A. gambiae | Ag17877 | ENSANGG00000017877
FoxL2 | H. sapiens | FOXL2 | AF301906
FoxL2 | A. gambiae | Ag10771 | ENSANGG00000010711
FoxL2 | C. intestinalis | GciWno608e22 | www.jgi.doe.gov
FoxM | H. sapiens | FOXM1 | U74612
FoxN1/4 | H. sapiens | FOXN1 (WHN) | Y11746
FoxN1/4 | H. sapiens | FOXN4 | AF425596
FoxN1/4 | D. melanogaster | jumu | NM_079578
FoxN1/4 | A. gambiae | Ag2273 | ENSANGG00000002273
FoxN1/4 | A. gambiae | Ag2274 | ENSANGG00000002274
FoxN1/4 | B. floridae | BfFoxN1 | JK-Y and LZH, unpublished
FoxN2/3 | H. sapiens | FOXN2 (HTLF) | U57029
FoxN2/3 | H. sapiens | FOXN3 (CHES1) | U68723
FoxN2/3 | D. melanogaster | ches-1 | AJ252199
FoxN2/3 | A. gambiae | Ag8820 | ENSANGG00000008820
FoxO | H. sapiens | FOXO1a,b (FKHR, FKHRP1) | U02310
FoxO | H. sapiens | FOXO3a,b (FKHRL1, FKHRL1P) | NM_001455, AF032887
FoxO | A. gambiae | Ag11033 | ENSANGG00000011033
FoxO | A. gambiae | Ag11036 | ENSANGG00000011036
FoxO | A. gambiae | Ag6490 | ENSANGG00000006490
FoxO | C. elegans | daf-16 | AF020342
FoxP | H. sapiens | FOXP1 | AF250920
FoxP | H. sapiens | FOXP2 | AAL10762
FoxP | H. sapiens | FOXP3 | Q9BZS1
FoxP | D. melanogaster | cg16899 | AAF54432
FoxP | A. gambiae | Ag14855 | ENSANGG00000014855
FoxP | C. elegans | F26D12.1 | NP_500833
FoxQ1 | H. sapiens | FOXQ (HFH11) | AF153341
FoxQ1 | C. intestinalis | Ciad036o10 | www.ghost.zool.kyoto-u.jp
FoxQ2 | D. melanogaster | cg11152 | AAM18014
FoxQ2 | A. gambiae | Ag410 | AAAB01008846_410
FoxQ2 | C. elegans | C25A1.2 (FKH-10) | CAB02761
FoxQ2 | C. intestinalis | CiGCiWno740j23 | www.jgi.doe.gov
FoxQ2 | B. floridae | AmphiFoxQ2 | AY163864
Orphans | H. sapiens | FREAC10 | AF042831
Orphans | D. melanogaster | cg32006 | www.flybase.bio.indiana.edu
Orphans | A. gambiae | Ag16120 | ENSANGG00000016120
Orphans | A. gambiae | Ag14807 | ENSANGG00000014807
Orphans | C. elegans | PES-1 | Q27253
Orphans | C. elegans | B0286.5 (FKH-6) | AAA80692
Orphans | C. elegans | F40H3.4 | AAC67430
Orphans | C. elegans | C29F7.4 (FKH-3) | CAB07324
Orphans | C. elegans | K03C7.2 | NP_508664
a A. gambiae genes are represented by ENSEMBL reference numbers, except the FoxQ2 gene which is not annotated and represented by its scaffold coordinates. Both can be accessed using the search facility at www.ensembl.org/Anopheles_gambiae.
(92%). Since genes from one organism (C. intestinalis) are present in both clades, the simplest interpretation is to split the FoxQ genes into two subclasses. We propose (as suggested by Yu et al., 2003) that these be named FoxQ1 and FoxQ2 respectively, as these names are consistent with nomenclature in published literature. In summary, FoxQ1 genes have been identified in chordates but not other taxa, while FoxQ2 genes have been identified in ecdysozoans and basal chordates, but not vertebrates.
3.13. Orphan genes
In addition to these clades, there were several genes whose relationships to vertebrate subclasses we could not resolve. These are the C. elegans sequences PES-1, B0286.5, F40H3.4, C29F7.4 and K03C7.2, the human gene FREAC10 and the A. gambiae sequence Ag14807. Furthermore, the D. melanogaster sequence cg32006 and A. gambiae sequence Ag16120 are orthologs (99% in Fig. 2), but we did not detect sequences orthologous to these in any other taxon. In total, however, these orphan genes make up a small proportion of the Fox gene complement of each taxon.
In summary, our phylogenetic analysis allowed us to ascribe most invertebrate Fox genes to a specific vertebrate subclass with reasonable bootstrap support. These subclasses represent 'orthology groupings' such that we infer all the genes in a subclass are descended from a single gene in the common ancestor of the taxa in which that subclass has been identified. A schematic representation of subclass distribution in the taxa examined is shown in Fig. 3. Table 1 lists the human and invertebrate genes we have assigned to each subclass, coupled with relevant accession numbers.
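The count of 14 shared clades can be checked against the subclass assignments with a short sketch. The presence sets below are transcribed from the species column of Table 1 (so, for instance, the D. melanogaster FoxO gene mentioned in Section 3.10 but not listed in the table is not included); the variable names and the grouping of taxa into `CHORDATES` and `ECDYSOZOANS` are conveniences of this sketch, not part of the original analysis.

```python
CHORDATES = {"Hs", "Ci", "Bf"}
ECDYSOZOANS = {"Dm", "Ag", "Ce"}

# Species with at least one gene assigned to each subclass (from Table 1).
TABLE1 = {
    "FoxA": {"Hs", "Dm", "Ag", "Ce", "Ci", "Bf"},
    "FoxB": {"Hs", "Dm", "Ag", "Ce", "Ci", "Bf"},
    "FoxC": {"Hs", "Dm", "Ag", "Bf"},
    "FoxD": {"Hs", "Dm", "Ag", "Ce", "Ci", "Bf"},
    "FoxE": {"Hs", "Ci", "Bf"},
    "FoxF": {"Hs", "Dm", "Ag", "Ce", "Ci", "Bf"},
    "FoxG": {"Hs", "Dm", "Ag", "Ce", "Bf"},
    "FoxH": {"Hs", "Ci"},
    "FoxI": {"Hs", "Ci"},
    "FoxJ1": {"Hs", "Bf"},
    "FoxJ2": {"Hs", "Ci"},
    "FoxK": {"Hs", "Dm", "Ag"},
    "FoxL1": {"Hs", "Dm", "Ag"},
    "FoxL2": {"Hs", "Ag", "Ci"},
    "FoxM": {"Hs"},
    "FoxN1/4": {"Hs", "Dm", "Ag", "Bf"},
    "FoxN2/3": {"Hs", "Dm", "Ag"},
    "FoxO": {"Hs", "Ag", "Ce"},
    "FoxP": {"Hs", "Dm", "Ag", "Ce"},
    "FoxQ1": {"Hs", "Ci"},
    "FoxQ2": {"Dm", "Ag", "Ce", "Ci", "Bf"},
}

# Subclasses with at least one chordate and at least one ecdysozoan member.
shared = [name for name, taxa in TABLE1.items()
          if taxa & CHORDATES and taxa & ECDYSOZOANS]
# Subclasses found in chordates but in none of the three ecdysozoans.
chordate_only = [name for name, taxa in TABLE1.items()
                 if taxa & CHORDATES and not taxa & ECDYSOZOANS]

print(len(shared), shared)
print(chordate_only)
```

With these inputs the shared list has 14 entries, and the chordate-only list corresponds to subclasses E, H, I, J (as J1 and J2), M and Q1, in line with the text.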
3.14. Subclasses present in chordates but missing in other taxa
As discussed above, we have not found members in other taxa of several of the subclasses defined in vertebrates. Currently, there are no ecdysozoan genes in the subclasses E, H, I, J, M and Q1. Several possibilities might explain this:
1. The data may be missing. We consider this unlikely in taxa for which complete or near-complete genome sequences and advanced assemblies are available.
2. The subclasses may have been lost by individual lineages.
3. The genes we have not been able to assign to specific subclasses may be missing orthologs now unrecognisable due to their level of divergence.
4. Some subclasses may have evolved specifically in the chordate lineage. New members of gene families can evolve by duplication, and under this scenario, duplication would have been followed by rapid divergence of one gene, obscuring the original relationship and pulling that gene (and the subclass that evolved from it) basally in phylogenetic analyses.
To investigate this possibility more closely, we examined our datasets for evidence of adaptive sequence change in specific lineages.
3.15. Evidence for the action of selection in Fox gene evolution
First, we tested for evidence of adaptive evolution in our sequence datasets. Using the pairwise Nei–Gojobori (Nei and Gojobori, 1986) method to calculate the ratio of nonsynonymous to synonymous nucleotide substitution rates (Ka/Ks) between genes in the dataset, we found that many of the pairwise comparisons, not surprisingly, showed saturation of Ks. Of those closer relationships that did not, there was evidence for a small number of events of positive selection or reduced functional constraints. Analysis of the four such instances among human genes (D2/A2; E1/F2; N4/E1; N4/E2) did not indicate phylogenetic support for recent gene duplications and may indicate evidence for Ks having been underestimated rather than true positive selection. Ks underestimation can be caused by nonrandom GC usage, codon bias, or other similar genome-level events (Liberles, 2001). Repeating the analysis with the maximum likelihood approach of Yang and Nielsen (2000) did not support positive selective pressures in these cases, but did for other comparisons involving subclass P, also at very high Ks.
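One way saturation of Ks shows up in practice can be sketched as follows. The Nei–Gojobori method converts the observed proportion of differences per site into a distance with the Jukes–Cantor correction, d = -(3/4) ln(1 - 4p/3), which is undefined once p reaches 0.75. The function names and the decision to return `None` for unusable comparisons are assumptions of this sketch, which also skips the counting of synonymous and nonsynonymous sites that precedes this step.

```python
import math
from typing import Optional


def jukes_cantor(p: float) -> Optional[float]:
    """Jukes-Cantor distance for an observed proportion of differences p.

    Returns None when p >= 0.75, where the correction is undefined
    (treated here as saturation).
    """
    if p >= 0.75:
        return None
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(p_nonsyn: float, p_syn: float) -> Optional[float]:
    """Ka/Ks from proportions of nonsynonymous and synonymous differences.

    Returns None if either distance is saturated or Ks is zero.
    """
    ka = jukes_cantor(p_nonsyn)
    ks = jukes_cantor(p_syn)
    if ka is None or ks is None or ks == 0:
        return None
    return ka / ks


# Hypothetical values: a closely related pair and a saturated pair.
print(ka_ks(0.05, 0.30))  # ratio well below 1, consistent with constraint
print(ka_ks(0.05, 0.80))  # None: synonymous sites are saturated
```

A ratio above 1 is conventionally read as a signal of positive selection, but as the text notes, an underestimated Ks can inflate the ratio, so such values are best treated as candidates rather than conclusions.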
While DNA-based methods to detect positive selective pressures are subject to saturation and genomic phenomena, protein-based methods can better be used to detect selective pressures leading towards altered protein function at long evolutionary distances. These methods have previously been implemented to study functional divergence in gene families of elongation factors and of leptin (Gaucher et al., 2001; Siltberg and Liberles, 2002).
One method for this analysis, Diverge, looks for site-specific shifts in substitution rates between clades where different functions are identified, using a relative rate test (Gu, 1999; Gu and Vander Velden, 2002). Groups with at least four sequences from subclasses P, O, N1/4, N2/3, F, Q1, I, D, E, A, B, C and G were selected from the tree in Fig. 1. Evidence for significant shifts in substitution patterns is seen between many subclasses in Table 2a. Amino acid positions with posterior probabilities greater than 0.5 in more than half of the significant comparisons for that subclass are indicated in Table 2b. Those with posterior probabilities greater than 0.5 in all significant comparisons are indicated in bold. Such residues are candidates for having functional roles in the selective divergence of subclasses and may represent the residues responsible for subclass-specific activity. In Table 2b, residues marked in italics appear to be facing the DNA binding surface in the crystal structure of Gajiwala et al. (2000) (PDB entry 1DP7) and are candidates for selective divergence in DNA binding specificity. Position 30 in subclass E is marked in both bold and italics and would be a particularly strong candidate for functional shifts. This position is also shifting in subclasses P, O, N1/4, N2/3, A, and C. The position is variable between Ser and Ala in subclass E, and different subclasses use Ser, Ala, Thr, Asn, Cys, or are variable between these amino acids, mostly small polar amino acids with different hydrogen bonding functionality. Covarion processes may be in operation, as other amino acid positions show similar patterns of variation. Unraveling the role of individual and coordinated substitutions as determinants of function after gene duplication in the Fox (Forkhead) gene family remains an important challenge.
Table 2a
[Pairwise comparison matrix between subclasses P, O, N1/4, N2/3, F, Q1, I, D, E, A, B, C and G; the caption and numerical values are not recoverable from this text.]
Table 2b
Residues with posterior probabilities greater than 0.5 in greater than half of the subclass specific comparisons are indicated
Subclass | Residues
P | 14, 23, 25, 30, 31, 34, 39, 40, 45, 53, 60, 61, 74, 76, 77, 79, 81
O | 12, 13, 14, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 30, 31, 34, 36, 37, 38, 39, 40, 41, 42, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 102, 103
N1/4 | 12, 13, 14, 19, 20, 22, 23, 24, 25, 26, 27, 28, 30, 31, 34, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 102, 103
N2/3 | 14, 21, 22, 25, 28, 30, 34, 36, 39, 43, 44, 45, 46, 47, 48, 49, 50, 51, 53, 59, 60, 61, 62, 76, 81, 82, 102
F | 27, 36, 46, 51
Q1 | 25, 34, 48, 77
I | 14, 21, 24, 25, 26, 27, 31, 34, 36, 40, 44, 45, 46, 47, 48, 51, 53, 59, 76, 77, 80, 81, 103
D | 21, 24, 31
E | 14, 21, 26, 28, 30, 31, 36, 44, 45, 46, 49, 52, 53, 60, 61, 62, 82, 103
A | 12, 13, 14, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 30, 36, 37, 38, 39, 40, 41, 42, 44, 45, 47, 49, 51, 52, 53, 59, 61, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 80, 81, 102, 103
B | 31, 39, 40, 46, 51, 53, 59, 61, 76
C | 26, 27, 28, 30, 31, 34, 36, 39, 44, 46, 47, 51, 53, 61, 79, 82
G | 28, 31, 40, 60
Those with posterior probabilities greater than 0.5 in all comparisons are indicated in bold, while those facing the DNA binding interface in the crystal structure of Gajiwala et al. (2000) are indicated in italics.
4. Conclusions
4.1 The majority of bilaterian Fox genes show clear, orthologous relationships with Fox genes in other taxa.
4.2 Fourteen clades of Fox genes have members in both chordates and ecdysozoans and can be considered primitive for bilaterian animals.
4.3 The FoxE, H, I, J, M and Q1 subclasses are absent from all three ecdysozoan taxa. These may have been lost on the ecdysozoan lineage or have resulted from duplications specific to the deuterostome lineage.
4.4 The FoxQ2 subclass is apparently missing in vertebrates and has presumably been lost.
4.5 There is evidence for site-specific changes in substitution rates during Fox gene subclass diversification.
Acknowledgements
We thank an anonymous referee for helping to improve the accessibility of the manuscript. F.M. and S.M.S. acknowledge the support of the BBSRC.
References
Baldauf, S.L., 1999. A search for the origins of animals and fungi: comparing and combining molecular data. Am. Nat. 154, S178–S188.
Clark, K.L., Halay, E.D., Lai, E., Burley, S.K., 1993. Co-crystal structure of the HNF-3/fork head DNA-recognition motif resembles histone H5. Nature 364, 412–420.
Gajiwala, K.S., Burley, S.K., 2000. Winged helix proteins. Curr. Opin. Struct. Biol. 10, 110–116.
Gajiwala, K.S., Chen, H., Cornille, F., Roques, B.P., Reith, W., Mach, B., Burley, S.K., 2000. Structure of the winged-helix protein hRFX1 reveals a new mode of DNA binding. Nature 403, 916–921.
Gaucher, E.A., Miyamoto, M.M., Benner, S.A., 2001. Function-structure analysis of proteins using covarion-based evolutionary approaches: elongation factors. Proc. Natl. Acad. Sci. U. S. A. 98, 548–552.
Granadino, B., Pérez-Sanchez, C., Rey-Campos, J., 2000. Forkhead transcription factors. Curr. Genomics 1, 353–382.
Gu, X., 1999. Statistical methods for testing functional divergence after gene duplication. Mol. Biol. Evol. 16, 1664–1674.
Gu, X., Vander Velden, K., 2002. DIVERGE: phylogeny-based analysis for functional–structural divergence of a protein family. Bioinformatics 18, 500–501.
Hong, H.K., Noveroske, J.K., Headon, D.J., Liu, T., Sy, M.S., Justice, M.J., Chakravarti, A., 2001. The winged helix/forkhead transcription factor Foxq1 regulates differentiation of hair in satin mice. Genesis 29, 163–171.
Kaestner, K.H., Knochel, W., Martinez, D.E., 2000. Unified nomenclature for the winged helix/forkhead transcription factors. Genes Dev. 14, 142–146.
Kaufmann, E., Knochel, W., 1996. Five years on the wings of fork head. Mech. Dev. 57, 3–20.
Liberles, D.A., 2001. Evaluation of methods for determination of a reconstructed history of gene sequence evolution. Mol. Biol. Evol. 18, 2040–2047.
Mazet, F., Shimeld, S.M., 2002. The evolution of chordate neural segmentation. Dev. Biol. 251, 258–270.
Nei, M., Gojobori, T., 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3, 418–426.
Ogg, S., Paradis, S., Gottlieb, S., Patterson, G.I., Lee, L., Tissenbaum, H.A., Ruvkun, G., 1997. The Fork head transcription factor DAF-16 transduces insulin-like metabolic and longevity signals in C. elegans. Nature 389, 994–999.
Page, R., 1996. TreeView: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 12, 357–358.
Schlake, T., Schorpp, M., Boehm, T., 2000. Formation of regulator/target gene relationships during evolution. Gene 256, 29–34.
Shu, W., Yang, H., Zhang, L., Lu, M.M., Morrisey, E.E., 2001. Characterization of a new subfamily of winged-helix/forkhead (Fox) genes that are expressed in the lung and act as transcriptional repressors. J. Biol. Chem. 276, 27488–27497.
Schubert, L.A., Jeffery, E., Zhang, Y., Ramsdell, F., Ziegler, S.F., 2001. Scurfin (foxp3) acts as a repressor of transcription and regulates T cell activation. J. Biol. Chem. 276, 37672–37679.
Schultz, J., Copley, R.R., Doerks, T., Ponting, C.P., Bork, P., 2000. SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res. 28, 134–231.
Shimeld, S.M., 1997a. Characterisation of amphioxus HNF-3 genes: conserved expression in the notochord and floor plate. Dev. Biol. 183, 74–85.
Shimeld, S.M., 1997b. A transcriptional modification motif encoded by homeobox and fork head genes. FEBS Lett. 410, 124–125.
Siltberg, J., Liberles, D.A., 2002. A simple covarion-based approach to analyse nucleotide substitution rates. J. Evol. Biol. 15, 588–594.
Smith, S.T., Jaynes, J.B., 1996. A conserved region of engrailed, shared among all en-, gsc-, Nk1-, Nk2- and msh-class homeoproteins, mediates active transcriptional repression in vivo. Development 122, 3141–3150.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
Toresson, H., Martinez-Barbera, J.P., Bardsley, A., Caubit, X., Krauss, S., 1998. Conservation of BF-1 expression in amphioxus and zebrafish suggests evolutionary ancestry of anterior cell types that contribute to the vertebrate telencephalon. Dev. Genes Evol. 208, 431–439.
Weigal, D., Jäckle, H., 1990. The fork head domain: a novel DNA binding motif of eukaryotic transcription factors. Cell 63, 455–456.
Yang, Z., Nielsen, R., 2000. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol. Biol. Evol. 17, 32–43.
Yasui, K., Saiga, H., Wang, Y., Zhang, P.J., Semba, I., 2001. Early expressed genes showing a dichotomous developing pattern in the lancelet embryo. Dev. Growth Differ. 43, 185–194.
Yu, J.-K., Holland, N.D., Holland, L.Z., 2002a. An amphioxus winged helix/forkhead gene, AmphiFoxD: insights into vertebrate neural crest evolution. Dev. Dyn. 225, 289–297.
Yu, J.-K., Holland, L.Z., Jamrich, M., Blitz, I.L., Holland, N.D., 2002b. AmphiFoxE4, an amphioxus winged helix/forkhead gene encoding a protein closely related to vertebrate thyroid transcription factor-2: expression during pharyngeal development. Evol. Dev. 4, 9–15.
Yu, J.-K., Holland, N.D., Holland, L.Z., 2003. AmphiFoxQ2, a novel winged helix/forkhead gene, exclusively marks the anterior of the amphioxus embryo. Dev. Genes Evol. 213, 102–105.