Peter C. Lockemann and Jens Nimis⋆⋆
Fakultaet fuer Informatik, Universitaet Karlsruhe (TH)
Postfach 6980, D-76128 Karlsruhe, GERMANY
[lockemann,nimis]@ipd.uka.de
Abstract. Layered architectures are a proven principle for the design of software systems and components. The paper introduces a layered reference architecture for software agents which assigns each agent property to select layers. It demonstrates how the same reference architecture provides a framework for a dependability model that locates the sources of failures and the ensuing error handling with a specific layer, thus integrating dependability directly into the design of agents.
1 Introduction
Dependability is not one of the favorite topics in the agent literature. Agents never seem to fail; at worst they behave in ways not expected or desired by their peers. But other disturbances do occur, disturbances that are outside the control of the agent system. Consider a job-shop situation where an order-agent negotiates with several machine-agents following the Contract Net Protocol. Suppose that the protocol has progressed to the point where some machine-agent A submits an offer to the order-agent. Suppose further that the order-agent’s accept message never reaches machine-agent A due to a communication failure. If no precautions are taken, machine-agent A forever holds the resources reserved for the order-agent, while the order-agent never gets its order fulfilled. Other failures are conceivable as well. Suppose that towards the end machine-agent A breaks down while processing the order, or takes longer than promised because it encounters some shortage. The order-agent would be expected to find some way around the problem if it were designed to handle such a contingency.
Generally speaking, we call a software agent dependable if it maintains its service according to specifications even if disturbances occur within the agent. We call a multiagent system dependable if its constituent agents are dependable and the system as a whole meets its objectives at all times. To achieve dependability the agent or multiagent system must include internal mechanisms that guarantee that all services are rendered according to specification no matter what happens to the agent.
⋆ This paper is a revised full version of a short paper presented at the 5th International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS’06).
⋆⋆ This work was in part funded by the Deutsche Forschungsgemeinschaft (DFG, German research council) within the priority research program (SPP) no. 1083.
Engineers are trained to design systems that take into account that something out of the normal can occur, and so should software engineers. Our introductory example seems to indicate that failures may occur, and occur on various levels of abstraction all the way from technical communication to peer collaboration. Correspondingly, their remedies will involve different levels of abstraction. One concept of software engineering that recognizes levels of abstraction is the design of software systems in the form of a layered architecture. It is the main thesis of this paper that if such an architecture exists for software agents one should be able to associate a specific failure and its remedies with an individual layer, and as a consequence to develop a well-structured error model where each kind of error handling can be isolated and localized.
To support the thesis the paper proceeds as follows. Section 2 introduces and motivates a layered reference architecture for software agents. Section 3 presents a general dependability model and tailors it to the specifics of a layered architecture. The results of these two sections are fused in Section 4 into a layered dependability model for agents. Section 5 discusses some novel solutions within the model. Section 6 concludes the paper.
2 Reference Architecture
2.1 Agent Properties
We follow the widely accepted doctrine that the non-functional properties decide the internal organization of a software component [1]. Following Wooldridge there are four properties that any agent should meet at its minimum [2]:
(1) A software agent is a computer system that is situated and continuously operates in some environment.
(2) A software agent offers a useful service. Its behavior can only be observed by its external actions.
(3) A software agent is capable of autonomous action in its environment in order to meet its design objectives, i.e., to provide its service.
(4) As a corollary to Property 3, the autonomy of a software agent is determined, and constrained, by its own goals, with goal deliberation and means-end assessment as parts of the overall decision process of practical reasoning.
Properties 3 and 4 let the observed behavior appear non-deterministic or, more benignly, flexible and, hence, set software agents apart from object-oriented software [3]. Wooldridge claims that flexibility is needed when the environment appears non-deterministic as well. He calls a software agent intelligent if it is capable of operating in a non-deterministic environment, and associates four more properties with it:
(5) An intelligent software agent is reactive, that is, it continuously perceives its environment, and responds in a timely fashion to changes that occur.
(6) An intelligent software agent achieves an effective balance between goal-directed and reactive behavior.
(7) An intelligent software agent may be proactive, that is, take the initiative in pursuance of its goals.
(8) An intelligent software agent may have to possess social ability, that is, is capable of interacting with other agents to provide its service.
We note that, apart from Property 8, Wooldridge’s properties say little about multiagent systems. Therefore, we extend Property 4 to include among the goals both the agent’s own and those of the entire agent community. Other properties such as technical performance and scalability of multiagent systems are important but will not be considered in the remainder.
2.2 Layering
It would be most convenient if we could develop an architecture that covers all agents no matter what their service and where only certain details must still be filled in for a concrete agent. Such an architecture is called an architectural framework or a reference architecture. It provides a set of guidelines for deriving a concrete architecture and, consequently, can only be defined on the basis of generic properties. And indeed, the eight properties from above apply to an agent no matter what its domain is.
Reference architectures follow specific architectural patterns. One of the most popular patterns is the layered architecture. An architecture is layered if its components can be arranged such that the functionality and the qualities (non-functional properties) of a more abstract higher layer can be constructed from those found on the more detailed lower layers [4]. The layers pattern has in the past been applied to agents as well (e.g. [5, 6]). However, the proposals always seem specialized towards the BDI framework, do not (as is usual for distributed systems) separate internal processing and communication, and avoid the layers for the computing infrastructure. Our proposal below intends to be more general.
Since Properties 1 to 4 are the ones shared by all agents, our goal is to determine the basic layered structure of our reference architecture from these properties. Properties 5 to 8 of intelligent agents should then have an influence only on the internals of some of the common layers.
Further, we divide each layer into two parts: One part reflects the information that needs to be provided to tailor the architecture to a specific domain, while the other lists the general (domain-independent) mechanisms for processing this information.
We note in passing that layered architectures are first of all design architectures. During implementation one may elect, for performance reasons, to add, fuse, omit or rearrange components and layers.
2.3 Individual Agent Architecture
Properties 1 and 2 suggest a first division, with a lower part providing a computing infrastructure and the upper part reflecting the service characteristics.
In addition, Properties 1 and 3 indicate that the lower part be further subdivided into two layers L1 and L2. L1 should supply all the ingredients necessary for an executable component, such as operating systems functionality, data management facilities as offered, e.g., by relational database systems, and data communication services such as HTTP or IIOP. L1 must also include the sensors and effectors necessary for the interaction with the environment. Layer L2, then, adds the mechanisms specifically geared towards agents. Often they constitute a central part of agent development frameworks. Take as an example the life cycle management of an agent (Property 1) or the individual control threads for each agent (Property 3).
The upper part can be subdivided as well. To meet Property 2 the agent must have some understanding of its environment, i.e., the application domain and its active players. The understanding is reflected on layer L3 by the world model of the agent. To link the world model and the sensed observations, and to translate the changes to effected actions, the agent may make use of ontologies that reflect the overall community knowledge.
A next higher layer L4 realizes the agent behavior according to Property 4. The layer must include all mechanisms necessary for goal deliberation and means-end assessment. Goal deliberation has more of a strategic character, means-end assessment is more tactical in nature and results in a selection among potential actions. Since both make use of the world model of layer L3, the rules for practical reasoning and the world model must match in structure, e.g., distinction between facts, goals and means, and in formalization, e.g., use of a symbolic representation. We restrict L4 to the individual goals. A separate layer L5 derives its actions from the more abstract social and economical principles governing the community as a whole. Fig. 1 summarizes the reference architecture for agents.
2.4 Locating Agent Intelligence
Given the layers of the reference architecture, we must now assign Properties 5 to 8 of intelligent agents to these layers (Fig. 1).
Property 5 (agent reactivity) and Property 6 (effective balance between goal-directed and reactive behavior) affect layers L3 and L4: The two may interact because the balance may result in the revision of goals and may thus also affect the world model.
Proactivity of an agent (Property 7) is closely related to autonomy and, hence, is already technically solved by the runtime environment of layer L2. Having control over its own progression as a result of reconciling its long-term and short-term goals implies the need for additional mechanisms on layer L4.
Property 8 primarily affects layer L5: While pursuing common goals the agent may communicate with other agents to elicit certain responses or to influence their goals, i.e., it may take a more proactive role than just adjusting its own behavior. Property 8 relies on layer L2 with its directory services or specific interaction protocols and communication formats such as speech-act based messages.
Fig. 1. Agent implementation architecture for a multi-agent system (layers L1 system environment base services, L2 agent-specific infrastructure services, L3 ontology-based domain model, L4 agent behavior, and L5 agent interaction, annotated with Properties P1 to P8; each layer has an action part and a coordination part, and distinguishes domain-independent mechanisms from domain adaptation)
We note in passing that we were able to demonstrate the higher generality of our reference architecture by refining layers L3 through L5 to BDI [7], and by relating the implementation architectures of JADEX [8] and InteRRap [9] directly to it [10].
2.5 Incorporating Agent Interaction
A multiagent system is a distributed system where system control is distributed, data is decentralized, and agents are loosely coupled. Hence, the basic mechanism for system control, the interaction protocol, must itself be realized in a distributed fashion across the agents. Since communication between the agents is asynchronous, the communication mechanism is message exchange.
As is well-known from telecommunications, the corresponding communication software in each participant follows the layers pattern [4] and, hence, is referred to as a protocol stack. Taking our cues from the well-known ISO/OSI model [11], its lower four layers (the transport system) are oblivious to the purpose of the message exchange and may thus be collapsed into layer L1. Purposeful communication is reflected by Property 8 so that interaction protocols are primarily reflected on layer L5. In between, layer L2 accounts for the sequence of message exchanges between the agents. Interaction depends on some common understanding among the agents, i.e., a shared world model and ontology. Layer L3 thus must provide suitable mechanisms for proper message encoding. By contrast, layer L4 controls the behavior of the individual agent and, hence, does not seem to contribute anything to the coordination per se.
Fig. 1 includes the protocol stack in the reference architecture.
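To make the assignment of properties to layers tangible, the following minimal sketch records one reading of Sections 2.3 to 2.5 as a small data structure. The class name, field names and the helper `layers_for` are illustrative assumptions and are not part of the reference architecture itself; the property assignment follows the prose above rather than the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    level: int            # 1 (lowest) to 5 (highest)
    name: str
    properties: tuple     # agent properties (1-8) located on this layer
    protocol_role: str    # contribution to the protocol stack, "" if none


# One reading of Sections 2.3-2.5; names and fields are assumptions.
REFERENCE_ARCHITECTURE = (
    Layer(1, "system environment base services", (1,), "message transport"),
    Layer(2, "agent-specific infrastructure services", (1, 3, 7, 8),
          "sequence of message exchanges"),
    Layer(3, "ontology-based domain model", (2, 5, 6), "message encoding"),
    Layer(4, "agent behavior", (4, 5, 6, 7), ""),
    Layer(5, "agent interaction", (8,), "conversation protocol"),
)


def layers_for(prop):
    """Return the levels of all layers on which a property is located."""
    return [l.level for l in REFERENCE_ARCHITECTURE if prop in l.properties]


def protocol_stack():
    """Return the levels that contribute to agent-to-agent communication."""
    return [l.level for l in REFERENCE_ARCHITECTURE if l.protocol_role]


if __name__ == "__main__":
    for p in range(1, 9):
        print(f"Property {p}: layers {layers_for(p)}")
    print("Protocol stack:", protocol_stack())
```

In this reading layer L4 is the only layer without a role in the protocol stack, which mirrors the observation that it does not contribute to the coordination per se.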
3 A Dependability Model
3.1 Failures and Dependability
We base our conceptual dependability model on (some of) the definitions of Laprie [12].
A computer system, in our case a multiagent system, is dependable if it maintains its service according to specifications even if disturbances occur that are due to events endogenous to the system such that reliance can justifiably be placed on this service. The service delivered by a system is the system behavior as it is perceived by another system or systems interacting with the considered system: its user(s). The service specification is an agreed description of the expected service. A system failure occurs when the delivered service deviates from the specified service.
The cause (in its phenomenological sense) of any failure is a (unintentional) fault. We should make sure that only very few faults occur in an agent (fault avoidance). A fault may not always be visible. If it is, i.e., if it is activated in the course of providing a service, it turns into an effective error. For example, a programming fault becomes effective upon activation of the module where the fault resides and an appropriate input pattern that activates the erroneous instruction, instruction sequence or piece of data. Only effective errors can be subject to treatment.
A dependability model describes the service, the effective errors occurring within the system, which ones are dealt with and how (error processing), and what, if any, the ensuing failures are.
3.2 Dependability Model for a Layer
It follows from Section 2 that error processing ultimately takes place within individual agents. Hence, to make our model operational we need to refine how processing of errors takes place in the agent. The processing will have to take note of the nature of the effective error which in turn has something to do with the nature of the underlying fault.
While all faults we consider are endogenous to the entire system, from the perspective of the individual agent some may be exogenous, henceforth called external. The other faults are internal. For example, if we consider a layer in the reference architecture of Section 2.3, an internal fault may be due to a programming fault or faulty parameter settings within the code of that layer. External faults are either infrastructure failures, i.e., failures of the hardware and software the agent depends on to function properly (basically layers L1 and L2), or peer failures, i.e., faults from agents on the same level of service, that fail to deliver on a request due to connection loss, their own total failure, or inability to reach the desired objective due to unfavorable conditions (unfavorable outcome, see Pleisch and Schiper [13]).
Error processing may itself be more or less successful, and this will affect the result of a service request. From the service requestor’s viewpoint the returned states fall into different classes (Fig. 2). At best, the fault may have no effect at all, i.e., the state reached after servicing the request is identical to the regular state. Or the state is still the old state, i.e., the state coincides with the state prior to the request because the fault simply caused the request to be ignored. A bit worse, servicing the request may reach a state that while no longer meeting the specification may still allow the service requestor a suitable continuation of its work (sustainable state). All three states meet the definition of dependability. At worst, the outcome may be a state that is plainly incorrect if not outright disastrous.
Fig. 2. States after service provision (from the initial state: regular state by undisturbed service execution or fault resilience, old state equivalent to the original state by fault containment, sustainable state by fault mitigation, incorrect state by error exposure)
We introduce the following notions. We speak of fault tolerance if a service reaches one of the three dependability states. Specifically, under fault resilience the regular state is reached, under fault containment the old state and under fault mitigation the sustainable state. Otherwise we have error exposure.
Fig. 3 relates the dependability model to the layered reference architecture, by showing two layers where the lower layer provides a service requested by the upper layer. Clearly, fault resilience and fault containment fall to the provider layer. Techniques to be used are indicated in Fig. 3 by enclosing rectangles. Fault containment is achieved by recovery. Typical techniques for fault resilience are recovery followed by retry, or service replication. Fault mitigation requires some action on the provider’s part as well, such as partial recovery. Since fault containment, fault mitigation and error exposure lead to irregular states, the requestor must be informed by error propagation. An old or sustainable state is well-defined and thus results in a controlled fault. In the first case the requestor will simply resume work in some way, in the second the requestor may have to transfer by compensation to a state from where to resume regular work. Error exposure leaves the requestor with no choice but to initiate its own error processing.
If no fault is detected normal processing takes place in the layer (not shown in Fig. 3). Notice that this does not imply that no fault occurred, just that it remains unprocessed as such and it is up to a higher layer to detect it.
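The classification of returned states can be written down compactly. The sketch below is only an illustration of the vocabulary of this section; the enumeration and the two lookup tables are assumptions introduced here, not an interface taken from any agent platform.

```python
from enum import Enum


class State(Enum):
    REGULAR = "regular state"
    OLD = "old state"
    SUSTAINABLE = "sustainable state"
    INCORRECT = "incorrect state"


# How the provider layer's error processing is named for each outcome.
OUTCOME = {
    State.REGULAR: "fault resilience",
    State.OLD: "fault containment",
    State.SUSTAINABLE: "fault mitigation",
    State.INCORRECT: "error exposure",
}

# What the requesting layer does once the outcome is known.
REQUESTOR_ACTION = {
    State.REGULAR: "continue",
    State.OLD: "resume",
    State.SUSTAINABLE: "compensate",
    State.INCORRECT: "own error processing",
}


def is_fault_tolerant(state):
    """Fault tolerance: one of the three dependability states is reached."""
    return state is not State.INCORRECT


def needs_error_propagation(state):
    """Every irregular state must be reported to the requestor."""
    return state is not State.REGULAR


def is_controlled_fault(state):
    """Old and sustainable states are well-defined, hence controlled."""
    return state in (State.OLD, State.SUSTAINABLE)


if __name__ == "__main__":
    for s in State:
        print(f"{s.value}: {OUTCOME[s]}, requestor should {REQUESTOR_ACTION[s]}")
```

The three predicates separate what is sometimes conflated: fault containment and fault mitigation count as fault tolerance, yet both still require error propagation to the requestor.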
Fig. 3. Dependability model: Error processing and responsibilities (provider layer L n with error processing by recovery, retry and replication for internal and external faults, the latter being peer failures or infrastructure failures; requestor layer L (n+1) with resume and compensate after error propagation)
4 Layered Dependability Model
4.1 Compressed Layer Model
Fig. 3 demonstrates that error processing may not always be successful when confined to a single layer. Instead it must often be spread to higher layers where a wider context is available, e.g., a sequence of statements rather than a single one, or a larger database. Consequently, a layered architecture should help to organize and localize the various dependability issues. In particular, we use the layered reference architecture of Section 2 for organizing the dependability of agents.
In order to discuss the responsibilities of each layer we use an abstraction of the dependability model of Fig. 3 that borrows from the Ideal Fault Tolerant Component of Anderson and Lee for software component architectures [14, p. 298].
As Fig. 4 illustrates, a layer can be in one of two operation modes. During normal activity the component receives service requests from layers higher up, eventually accomplishes them using the services of the next lower layer components and returns normal responses. It may also draw on the services of other components on the same layer or receive service requests from such components. If the lower layer fails to service the request, this can be interpreted on the current layer as the occurrence of a fault. The mode changes to abnormal activity where the fault is interpreted and treated. Error processing follows the model of Fig. 3. Propagated errors are signaled as a failure to the next higher layer. Otherwise the layer returns to normal mode and tries to resume normal operation.
Fig. 4. Compressed dependability model for a single layer (layer n switches between normal service operation and abnormal activity, i.e., error processing, triggered by exceptions, peer failures, or failures propagated from layer n-1; it either returns to normal operation by fault resilience or signals a failure to layer n+1 by error propagation)
Abnormal activity mode may also be entered due to abnormal events on the same layer. One such event is a malfunction within the same component, i.e., an internal fault. A typical mechanism for signaling such an event is (in Java terminology) by throwing exceptions. Exceptions can also be used to model a situation where a fault was not activated further down and thus remains unrecognized while crossing a layer in normal mode and moving upwards as a normal service response. Only on a layer higher up the fault is detected as such.
Another kind of abnormal event is peer failures. If they are recognized as such they are being dealt with, otherwise they propagate upwards as part of a normal service response only to induce an exception higher up.
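The two operation modes can be illustrated with ordinary exception handling. The following sketch is a simplified illustration under stated assumptions: the class `Layer`, the exception `LayerFailure` and the optional `handler` callback are hypothetical names, and peer failures as well as undetected faults are left out.

```python
class LayerFailure(Exception):
    """Failure signaled by a layer to the next higher layer."""


class Layer:
    def __init__(self, name, lower=None, handler=None):
        self.name = name
        self.lower = lower        # next lower layer, None for the bottom layer
        self.handler = handler    # hypothetical error processing callback

    def work(self, request, lower_response):
        """Normal service operation; may raise LayerFailure (internal fault)."""
        return lower_response

    def serve(self, request):
        try:
            # Normal activity: use the service of the next lower layer.
            below = self.lower.serve(request) if self.lower else request
            return self.work(request, below)
        except LayerFailure as fault:
            # Abnormal activity: interpret and treat the fault.
            if self.handler is None:
                # Error propagation: signal a failure to the layer above.
                raise LayerFailure(f"{self.name}: {fault}") from fault
            # Fault tolerance: the handler produces a usable response.
            return self.handler(request, fault)


class FailingBase(Layer):
    """Bottom layer that always suffers an internal fault."""

    def work(self, request, lower_response):
        raise LayerFailure("connection loss")


l1 = FailingBase("L1")
l2 = Layer("L2", lower=l1)                      # cannot treat the fault
l4 = Layer("L4", lower=l2,
           handler=lambda req, fault: f"fallback for {req}")

if __name__ == "__main__":
    print(l4.serve("order-42"))
```

Here the failure crosses L1 and L2 unhandled and is absorbed only on the layer that was given a handler, which corresponds to spreading error processing to a layer where a wider context is available.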
For the rest of this section we give a structured overview of the various faults that may occur on each layer, and short summaries of some of the approaches to cope with them. Fig. 5 depicts failures and approaches and thus summarizes the following subsections.
4.2 System Environment Base Services (L1)
Layer L1 encompasses all services general to a computing environment including data communication. Consequently, all failures occurring are infrastructure failures.
The individual agent is affected by hardware and operating system failures. Operating systems have a built-in dependability, usually by redundancy, that allows them to recover from hardware failures. The operating system will show a failure only if such measures are ineffective. If the failure is catastrophic the system will stop executing altogether. If the failure is non-catastrophic such as excessive waiting or unsuccessful disk operation, a failure will be propagated upwards with a detailed description of the cause.
Similarly, most of the faults typical for classical data communication (e.g., packet loss/duplication/reordering, or network partition) are internally resolved by any of the numerous protocols, i.e., modern data communication is inherently dependable (see for example [15]). This leaves very few failures to be propagated upwards, such as network delays, connection loss, and non-correctable transmission failures. In particular, a catastrophic event at one agent can only be detected by other agents where it manifests itself as an unreachable destination.
Some dependability aspects are subject to agreement and become quality-of-service (QoS) parameters, e.g., transmission reliability with respect to loss and duplication (at-least-once, at-most-once, exactly-once, best effort), or order preservation. Qualities outside the agreement must be controlled on the next higher layer and if deemed unsatisfactory will raise an exception there.
Fig. 5. Dependability in the layered agent architecture. Per layer, the figure lists typical failures and the approaches to cope with them:
L5 (agent interaction, i.e., social and/or economical coordination): undependable peers and uncorrectable interoperability failures, e.g., network split and unreachable peers (from L1), unconvertible ontology semantics (from L3); fault avoidance/resilience by verifiable protocol design models, robust mechanisms, trust management.
L4 (agent behavior, i.e., knowledge processing for goal deliberation and action selection): interfering autonomous peer action, catastrophic run-time environment failures (from L1/L2); fault tolerance by recovery management and/or uncertainty management.
L3 (ontology-based domain model, i.e., ontology management and translation, agent knowledge management): ontology mismatches, (de)coding failures; fault tolerance by ontology term meaning negotiation. Failures from infrastructure layers are passed through L3 directly to L4 or L5.
L2 (agent-specific infrastructure services, i.e., message handling, service discovery, agent mobility): faulty message sequences, out-dated service registrations, and non-catastrophic failures from L1 such as message losses and connection breakdowns; fault tolerance by platform mechanisms, like active directories or transactional recovery.
L1 (system environment base services, i.e., hardware, operating system, data management, data communication): hardware failures, e.g., unsuccessful disk operations, and data transmission failures, e.g., by packet losses; fault tolerance by redundant hardware, robust data communication protocols.
4.3 Agent-specific Infrastructure Services (L2)
Layer L2 covers the infrastructure services that are specialized towards the needs of agents. Typical platforms are agent development/runtime frameworks like JADE [16] or LS/TS (Living Systems® Technology Suite [17]).
The foremost platform mission is agent interoperability. Typical services concern message handling and exchange, trading and naming, or agent mobility. To give an example, message handling is in terms of the FIPA ACL messages with their numerous higher-level parameters, where in particular the message content may be expressed in terms of the FIPA ACL communicative acts. Now take an uncorrectable data transmission failure on layer L1. It can result in a message loss on the platform layer. Some of the multi-agent platforms such as JADE allow the developer to specify a time-out interval within which a reply to a given message must be received, and once it expires invoke an exception method such as a retry by requesting a resend of the original message.
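The time-out and retry technique can be sketched independently of any particular platform. The code below does not use the JADE API; the functions `request_with_retry` and `make_lossy_channel`, the exception `ReplyTimeout`, and all parameter names are assumptions made for illustration, with a standard-library queue standing in for the agent's message inbox.

```python
import queue


class ReplyTimeout(Exception):
    """No reply arrived within the time-out, even after all retries."""


def request_with_retry(send, replies, message, timeout=0.05, max_attempts=3):
    """Send a message and wait for a reply; resend when the time-out expires.

    Returns the reply together with the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        send(message)
        try:
            return replies.get(timeout=timeout), attempt
        except queue.Empty:
            continue  # time-out expired: fall through to the next resend
    # Retries exhausted: propagate the failure to the next higher layer.
    raise ReplyTimeout(f"no reply to {message!r} after {max_attempts} attempts")


def make_lossy_channel(drop_first):
    """Simulated peer whose first `drop_first` incoming messages are lost."""
    replies = queue.Queue()
    sent = {"count": 0}

    def send(message):
        sent["count"] += 1
        if sent["count"] > drop_first:
            replies.put(f"reply to {message}")

    return send, replies


if __name__ == "__main__":
    send, replies = make_lossy_channel(drop_first=1)
    print(request_with_retry(send, replies, "accept-proposal"))
```

A resend of this kind yields at-least-once delivery, so the receiving agent must tolerate duplicates; if the retries are exhausted the failure is passed upwards, in line with the layer model of Section 4.1.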
Another important function is trading, locating an agent that offers the services needed by another agent. The yellow pages functionality of FIPA’s Directory Facilitator implements such a function. A fault internal to layer L2 may be an invalid agent service registration. Outdated entries can internally be avoided by an active Directory Facilitator that periodically checks for availability of the registered services. Otherwise the upper layers must be notified of the failure, where most probably the coordination layer (L5) would have to handle it by finding an alternative peer to provide a similar service.
4.4 Ontology-based Domain Model (L3)
From layer L3 on upwards the domain semantics come into play. Aside from programming faults the only problems that may arise on layer L3 are peer failures in connection with ontologies. For example, in an exchanged message a partner agent may not share the ontology of a given agent so that there is a mismatch in the meaning attributed to a certain term. Provided the mismatch can be detected, one could try approaches like Mena et al. [18] that try to negotiate the meaning of the ontological terms during the course of the agents’ interaction. But even if sender and receiver share the same ontology, different coding into and decoding from the transport syntax can be a source of errors.
4.5 Agent Behavior (L4)
Layer L3 deals foremost with static aspects. By contrast, layer L4 deals with the agent functionality and, hence, with the domain dynamics. Therefore, it has by far the most complicated internal structure of all layers, that is based on certain assumptions, i.e., the way knowledge and behavior of the agents are presented and processed. Clearly then, layer L4 should be the layer of choice for all faults (many originating on layer L2) whose handling requires background knowledge of the domain, the goals and the past interaction history. Consequently, most of the error processing will fall under the heading of fault resilience. This is true even for catastrophic infrastructure failures. Suppose the agent temporarily loses control when its platform goes down, but regains it when the platform is restarted. Recovery from catastrophic failure is usually by backward recovery on the platform layer. The resulting fault containment should now be interpreted in light of the agent’s prior activities. If these were continuously recorded the agent may add its own recovery measures, or judge itself to be somewhat uncertain of its environment but then act to reduce the uncertainty.
4.6 Agent Interaction (L5)
Whereas layer L4 is the most critical for dependable functioning of the single agent, layer L5 must ensure the dependable performance of the entire multiagent system. We refer to the coordinated interaction of agents as a conversation. Conversations follow a (usually predefined) conversation protocol. A number of approaches deal with this issue by supporting the protocol design process. Paurobally et al. [19], Nodine and Unruh [20], Galan and Baker [21] and Hannebauer [22] present approaches in which they enrich a specification method for conversation protocols by a formal model that allows them to ensure the correctness of the protocol execution analytically or constructively.
At runtime conversations are threatened by uncorrectable interoperability failures that are passed upwards from lower layers. The infrastructure layers may pass up network splits and unreachable peers, the ontology layer semantic disagreements, the agent behavior layer unforeseen behavior of peer agents. Layer L5 may itself discover that peers respond in mutually incompatible ways. Techniques of fault resilience seem essential if the user is not to be burdened. For example, the agents may search for similar services, or they may attempt to agree on a new protocol with different peer roles or peers.
4.7 Errors Passed through to the User
A multiagent system may not be able to automatically handle each and every fault, and probably should not even be designed to do so, particularly if the outcome of the failure has a deep effect on the MAS environment and it remains unclear whether any of the potential therapies improves the environmental situation. Consequently, a careful design should identify which errors should be passed as failures up to what one could conceptually see as an additional “user” layer.
5 Evaluation
5.1 Experimental Strategy
The thesis of this paper is that provided a layered architecture exists for software agents one should be able to associate a specific failure and its remedies with an individual layer. The difficulty with proving the hypothesis is its premise: Find a multiagent system whose agents follow a layered architecture, for example along the lines of Fig. 1. While we have shown in Section 4 how classical failures fit into the dependability model, what would be needed is to generate more sophisticated faults, to study how easy it would be to incorporate their processing in the model, and to compare the solutions with known ones developed elsewhere. Obviously, this would have to be a long-lasting experiment requiring cooperation with other groups.
Still, we wished to do first experiments. We decided to base our novel dependability mechanisms for software agents on the classical instrument of transactions. We made use of a FIPA compliant agent platform (JADE [16]) or simulated the layered sub-architecture of the BDI framework. In Sections 5.2 through 5.4 we outline the experiments and discuss how they would fit into the layered dependability model.
For the longer run we propose the following design of experiments. Start from a layered implementation of an MAS and a well-structured application, e.g., for job-shop scheduling following the contract net protocol. Enrich the system by a simulation engine that allows all kinds of disturbances to be induced on the different layers. Since the layered implementation should allow different dependability mechanisms to be plugged into the MAS, study which disturbances can be handled by the different dependability mechanisms and which failures are to be passed upwards along the layers.
5.2 L2: Recovery by Transactional Conversations
A well-known abstraction of behavior that includes recovery is the transaction. In its purest form the transaction has the ACID properties (see [23]). In particular, it is atomic, i.e., is executed completely or not at all, hence in the failure case, assumes backward recovery; and it is consistent, i.e., if executed to completion it performs a legitimate state transition. Transaction systems guarantee these properties without any knowledge of the algorithms underlying the transactions. Consequently, recovery is a layer L2 technique.
The transactional abstraction of a conversation is the distributed transaction [24]. Conversations spawn local transactions in the participating agents, that are coordinated by the initiating agent. Recovery is then implemented via a 2-Phase-Commit (2PC) protocol [23]. Each agent must include a resource manager that locally enforces the ACID properties and manages access to the data items, and a transaction manager that guarantees, via the 2PC, atomicity for distributed transactions.
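The core of the 2PC protocol can be illustrated in a few lines. The sketch below is a minimal in-memory illustration and not the implementation used in our experiments: the class `Participant` and the function `two_phase_commit` are hypothetical, and logging, time-outs and coordinator failure are deliberately omitted.

```python
class Participant:
    """Local transaction of one agent taking part in a conversation."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        """Phase 1: vote on whether the local transaction can commit."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        """Phase 2, positive decision: make the local changes permanent."""
        self.state = "committed"

    def rollback(self):
        """Phase 2, negative decision: backward recovery to the old state."""
        self.state = "aborted"


def two_phase_commit(participants):
    """Run 2PC as the initiating agent; return True if all committed."""
    # Phase 1 (voting): every participant is asked to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): commit only on a unanimous yes, else roll back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


if __name__ == "__main__":
    agents = [Participant("order-agent"),
              Participant("machine-agent A", can_commit=False)]
    print(two_phase_commit(agents), [p.state for p in agents])
```

A single negative vote returns every participant to its old state, which is the fault containment outcome of Section 3.2.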
Distributed transactions benefit from a standardized architecture that determines the basic workflow and the interfaces (see, e.g., the X/Open DTP Reference Model [25]) and, hence, can be implemented by commercial products. They also stand a good chance of being incorporated into a dependable FIPA-compliant [26] agent development framework. We did so while making use of the commercial products Oracle 9i RDBMS¹ for resource manager and ORBacus OTS² for transaction manager (for technical details see [24]). Basically, we relied on the FIPA-OS concept of tasks. Roughly speaking, an agent has a special task to handle the protocol execution for each conversation type it participates in. To allow for a structured agent design, FIPA-OS provides the ability to split the functionality of a task into so-called child-tasks.
¹ Oracle Corporation: http://www.oracle.com/database
² IONA Technologies: ORBacus Object Request Broker: http://www.orbacus.com
Transactional conversations are a technique for fault containment. Nonetheless, 2PC is problematical for a MAS. First, 2PC has a negative effect on agent autonomy because in case of serious delays or node failures all other agents are held up. Second, 2PC raises additional difficulties in an open environment. Consequently, restrictions must be imposed and one must carefully consider whether one can live with them.
5.3 L3/4: Compensation by Belief Management
Suppose we do without 2PC, with just the individual agents being transactional but the conversation as a whole not. One could argue that the agents are certain of their own state but uncertain of the state of the conversation. Thus we have a case of fault mitigation. Since the BDI framework makes it possible to cope with the non-determinism of the environment where beliefs stand for the uncertain, fuzzy, incomplete knowledge acquired by the agent, beliefs seem a good model to guide compensation [27].
“When changing beliefs in response to new evidence, you should continue to believe as many of the old beliefs as possible” [28, 29]. How this principle is turned into a practical solution may give rise to different belief models. Our approach uses two non-deterministic transformations [30]. Revision adds a new proposition. If the information is incompatible with the previous state then older contradictory information is removed. Contraction removes a proposition. Removal may trigger further removals until no incompatible proposition can further be derived. For both there may be more than one way to restore consistency. A third transformation, expansion, simply adds the observation as a new belief no matter whether it contradicts existing beliefs or not.
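The three transformations can be illustrated on a deliberately simple, crisp belief set. The sketch below is not the fuzzy model used in our approach: beliefs are plain literals, a leading "~" marks negation, rules are (premises, conclusion) pairs with non-empty premises, and the non-deterministic choice of which premise to give up is replaced by an arbitrary fixed one (the alphabetically first). Revision is expressed as contraction by the negated literal followed by expansion, i.e., the Levi identity. All function names are assumptions.

```python
def negate(literal):
    """Return the complementary literal ('p' <-> '~p')."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def expand(beliefs, literal):
    """Expansion: add the observation, contradictory or not."""
    return frozenset(beliefs) | {literal}


def contract(beliefs, literal, rules=()):
    """Contraction: remove a literal and whatever would re-derive it."""
    result = set(beliefs) - {literal}
    for premises, conclusion in rules:
        if conclusion == literal and set(premises) <= result:
            # The rule could still derive the literal: give up one premise.
            # The choice is arbitrary here; in general it is non-deterministic.
            result = set(contract(result, sorted(premises)[0], rules))
    return frozenset(result)


def revise(beliefs, literal, rules=()):
    """Revision: remove contradicting beliefs, then add the new one."""
    return expand(contract(beliefs, negate(literal), rules), literal)


if __name__ == "__main__":
    beliefs = frozenset({"accept_delivered", "resources_reserved",
                         "deadline_known"})
    rules = [(("accept_delivered",), "resources_reserved")]
    print(sorted(revise(beliefs, "~resources_reserved", rules)))
    print(sorted(expand(beliefs, "~resources_reserved")))
```

In the example the agent learns that the resources are not reserved after all. Revision gives up both the contradicted belief and the premise that would re-derive it, while the unrelated belief is retained, in keeping with the principle quoted above; expansion keeps everything and leaves the belief set inconsistent.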
More specifically, we modelled individual beliefs as fuzzy propositions over a discrete set of (not necessarily numerical) values. The fuzziness makes it possible to move from a notion of strict consistency to a gradual consistency degree. An agent can dynamically adapt its consistency degree over time: arriving fresh on to a scene, it might want to absorb many, possibly contradicting beliefs, while being more strict for contexts established over a longer period of time. Fuzzified versions of the operators introduced above modify belief sets while considering whether the remaining consistency reaches at least the prescribed degree.
5.4 L4/5: Compensation by Distributed Nested Transactions
Belief management involves the individual agents in a compensation. As an alternative, one may try to associate the compensation with the entire conversation and, hence, involve an entire group of agents. As a prerequisite, the conversation must be understood to some detail, and this in turn presupposes some understanding of the agent behavior. Consequently, such an approach is located on layers L4 and L5.
First of all, then, we need a behavioral abstraction for agents that takes the layered architecture into account. A model that reflects that an external event may spawn several interrelated actions on different layers is the nested transaction [31]. In this model a transaction can launch any number of subtransactions, thus forming a transaction tree. Subtransactions may execute sequentially, in parallel or alternatively. A subtransaction (or the entire transaction) commits once all its children have terminated, and makes its results available to all other subtransactions. If one of the children fails, the parent transaction has the choice between several actions: abort the subtransaction, backward recover and retry it, or attempt forward recovery by launching a compensating subtransaction. To reflect a conversation, the nested transactions of the participating agents must be synchronized. Each transaction is augmented by special synchronization nodes that trigger events in the partner transactions [32]. This mechanism can now also be used to incorporate a conversation-wide error compensation.
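The parent's choices on a child failure (abort, retry, or compensate) can be sketched for a single agent as follows. This is a minimal illustration under stated assumptions, not the infrastructure of [31,32]: the class `Transaction`, its fields `retries` and `compensation`, and the logging scheme are hypothetical; children run sequentially only; no state is actually rolled back before a retry; and the synchronization nodes between partner transactions are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


class TransactionAborted(Exception):
    """Raised when a (sub)transaction cannot commit."""


@dataclass
class Transaction:
    name: str
    action: Callable[[], None]                    # local work of this node
    children: List["Transaction"] = field(default_factory=list)
    retries: int = 0                              # extra attempts after a failure
    compensation: Optional["Transaction"] = None  # forward-recovery alternative

    def run(self, log: List[str]) -> None:
        """Run the local action, then all children; commit only if all succeed."""
        try:
            self.action()
            for child in self.children:
                self._run_child(child, log)
        except Exception as exc:
            log.append(f"abort {self.name}")
            raise TransactionAborted(self.name) from exc
        log.append(f"commit {self.name}")

    def _run_child(self, child: "Transaction", log: List[str]) -> None:
        # Retry the child up to child.retries additional times.
        for _ in range(child.retries + 1):
            try:
                child.run(log)
                return
            except TransactionAborted:
                continue
        # Forward recovery: launch the compensating subtransaction instead.
        if child.compensation is not None:
            child.compensation.run(log)
            return
        # No option left: the failure propagates and aborts the parent.
        raise TransactionAborted(child.name)


def noop() -> None:
    pass


def fail() -> None:
    raise RuntimeError("machine-agent broke down")


# Usage: producing fails, so the order is reassigned by a compensating subtransaction.
log: List[str] = []
order = Transaction("order", noop, children=[
    Transaction("reserve", noop),
    Transaction("produce", fail, compensation=Transaction("reassign", noop)),
])
order.run(log)
# log == ["commit reserve", "abort produce", "commit reassign", "commit order"]
```

Keeping the recovery policy in the parent mirrors the description above, where the parent transaction decides how to react to a failed child.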
The drawback of the approach is that there is no longer a clear separation between agent behavior and conversation. Consequently, conversations are difficult to extract and adapt.
6 Conclusions
The paper demonstrates that one can systematically derive from the properties of intelligent software agents a layered reference architecture for agents where the properties can be assigned to specific layers. We showed that this architecture also offers a framework for locating failure occurrences and the ensuing dependability mechanisms on specific layers. Thus the reference architecture allows dependability to be included in the design of multiagent systems right from the beginning. Our own work seems to suggest that the approach is indeed viable, although it may not always be possible to confine handling of a particular failure to just one layer. As indicated in Sections 5.2 to 5.4, the detailed dependability techniques themselves still raise intricate issues.
References
1. Starke, G.: Effective Software Architectures (in German). Hanser, Munich, Germany (2005)
2. Wooldridge, M.J.: An Introduction to MultiAgent Systems. Wiley, Chichester (2002)
3. Luck, M., Ashri, R., d’Inverno, M.: Agent-Based Software Development. Artech House, Inc., Norwood, MA, USA (2004)
4. Buschmann, F., Meunier, R., Rohnert, H., Sommerlad, P., Stal, M.: Pattern-Oriented Software Architecture: A System of Patterns. John Wiley & Sons, New York City, NY, USA (1996)
5. Kendall, E.A., Pathak, C.V., Krishna, P.V.M., Suresh, C.B.: The layered agent pattern language. In: Conference on Pattern Languages of Programs (PLoP’97). (1997)
6. Jennings, N.R.: Specification and implementation of a belief-desire-joint-intention architecture for collaborative problem solving. International Journal of Intelligent and Cooperative Information Systems 2(3) (1993) 289–318
7. Bratman, M.E., Israel, D.J., Pollack, M.E.: Plans and resource-bounded practical reasoning. Computational Intelligence 4(4) (1988) 349–355
8. Braubach, L., Pokahr, A., Lamersdorf, W.: Jadex: A BDI agent system combining middleware and reasoning. In Unland, R., Calisti, M., Klusch, M., eds.: Software Agent-Based Applications, Platforms and Development Kits. Birkhaeuser, Basel, Switzerland (2005) 143–168
9. Müller, J.P.: The Design of Intelligent Agents: a Layered Approach. Springer, Heidelberg, Germany (1996)
10. Lockemann, P.C., Nimis, J., Braubach, L., Pokahr, A., Lamersdorf, W.: Architectural design. In Kirn, S., Herzog, O., Lockemann, P.C., Spaniol, O., eds.: Multiagent Engineering: Theory and Applications in Enterprises. International Handbooks on Information Systems. Springer Verlag, Heidelberg, Germany (2006 (to appear))
11. Stallings, W.: Data and computer communications. 7. edn. Pearson Prentice Hall, Upper Saddle River, NJ (2004)
12. Laprie, J.C.: Dependable computing and fault tolerance: concepts and terminology. In: 15th IEEE Symposium on Fault Tolerant Computing Systems (FTCS-15). (1985) 2–11
13. Pleisch, S., Schiper, A.: Approaches to fault-tolerant and transactional mobile agent execution - an algorithmic view. ACM Comput. Surv. 36(3) (2004) 219–262
14. Anderson, T., Lee, P.A.: Fault Tolerance: Principles and Practice. Prentice/Hall International, Englewood Cliffs, NJ, USA (1981)
15. Halsall, F.: Data communications, computer networks and open systems. Addison-Wesley, Harlow, England (1998)
16. Bellifemine, F., Poggi, A., Rimassa, G.: Jade: a FIPA2000 compliant agent development environment. In: 5th International Conference on Autonomous Agents (AGENTS’01), New York, NY, USA, ACM Press (2001) 216–217
17. Rimassa, G., Calisti, M., Kernland, M.E.: Living systems technology suite. Technical report, Whitestein Technologies AG, Zurich, Switzerland (2005)
18. Mena, E., Illarramendi, A., Goni, A.: Automatic ontology construction for a multiagent-based software gathering service. In Klusch, M., Kerschberg, L., eds.: 4th International Workshop on Cooperative Information Agents, Heidelberg, Germany, Springer (2000) 232–243
19. Paurobally, S., Cunningham, J., Jennings, N.R.: Developing agent interaction protocols using graphical and logical methodologies. In Dastani, M., Dix, J., El Fallah Seghrouchni, A., Kinny, D., eds.: Workshop on Programming Multi-Agent Systems (ProMAS 2003). (2003) 1–10
20. Nodine, M., Unruh, A.: Constructing robust conversation policies in dynamic agent communities. In Dignum, F., Greaves, M., eds.: Issues in Agent Communication. Volume 1916, Heidelberg, Germany, Springer (2000) 205–219
21. Galan, A., Baker, A.: Multi-agent communications in jafmas. In: Workshop on Specifying and Implementing Conversation Policies, Seattle, Washington (1999) 67–70
22. Hannebauer, M.: Modeling and verifying agent interaction protocols with algebraic Petri nets. In: 6th International Conference on Integrated Design and Process Technology (IDPT-2002), Pasadena, USA (2002)
23. Weikum, G., Vossen, G.: Transactional information systems: theory, algorithms and the practice of concurrency control and recovery. Morgan Kaufmann, San Francisco (2002)
24. Nimis, J., Lockemann, P.: Robust multi-agent systems: The transactional conversation approach. In: 1st International Workshop on Safety and Security in Multiagent Systems (SASEMAS04), New York City, NY, USA (2004)
25. The Open Group: Distributed Transaction Processing: Reference Model, Version 3. The Open Group (1996)
26. Poslad, S., Charlton, P.: Standardizing agent interoperability: The FIPA approach. In Luck, M., Marík, V., Stepánková, O., Trappl, R., eds.: 9th ECCAI Advanced Course ACAI 2001, Selected Tutorial Papers. (2001) 98–117
27. Lockemann, P.C., Witte, R.: Agents and databases: Friends or foes? In: 9th International Database Applications and Engineering Symposium (IDEAS05), Montreal, Canada, IEEE Computer Soc. (2005) 137–147
28. Harman, G.: Change in View: Principles of Reasoning. The MIT Press, Cambridge, MA, USA (1986)
29. Katsuno, H., Mendelzon, A.: On the difference between updating a knowledge base and revising it. In Gärdenfors, P., ed.: Belief Revision. Cambridge University Press, Cambridge, MA, USA (1992) 183–203
30. Witte, R.: Architecture of Fuzzy Information Systems. PhD thesis, Universitaet Karlsruhe (TH), Karlsruhe, Germany (2002)
31. Nagi, K.: Scalability of a transactional infrastructure for multi-agent systems. In Wagner, T., Rana, O.F., eds.: 1st International Workshop on Infrastructure for Scalable Multi-Agent Systems, Heidelberg, Germany, Springer (2000) 266–272
32. Nagi, K.: Modeling and simulation of cooperative multi-agents in transactional database environments. In Wagner, T., Rana, O.F., eds.: 2nd International Workshop on Infrastructure for Scalable Multi-Agent Systems, Heidelberg, Germany, Springer (2001)