MiningMobileSequentialPatternsinaMobileCommerceEnvironment
Ching-HuangYunandMing-SyanChen,Fellow,IEEE
Abstract—Inthispaper,weexploreanewdataminingcapa-bilityforamobilecommerceenvironment.Tobetterreflectthecustomerusagepatternsinthemobilecommerceenvironment,weproposeaninnovativeminingmodel,calledminingmobilesequen-tialpatterns,whichtakesboththemovingpatternsandpurchasepatternsofcustomersintoconsideration.Howtostrikeacom-promiseamongtheuseofvariousknowledgetosolvetheminingonmobilesequentialpatternsisachallengingissue.Wedevisethreealgorithms(algorithmTJLS,algorithmTJPT,andalgo-rithmTJPF)fordeterminingthefrequentsequentialpatterns,whicharetermedlargesequentialpatternsinthispaper,fromthemobiletransactionsequences.AlgorithmTJLSisdevisedinlightoftheconceptofassociationrulesandisusedasthebasicscheme.AlgorithmTJPTisdevisedbytakingboththeconceptsofassocia-tionrulesandpathtraversalpatternsintoconsiderationandgainsperformanceimprovementbypathtrimming.AlgorithmTJPFisdevisedbyutilizingthepatternfamilytechniquewhichisde-velopedtoexploittherelationshipbetweenmovingandpurchasebehaviors,andthusisabletogeneratethelargesequentialpat-ternsveryefficiently.Asimulationmodelforthemobilecommerceenvironmentisdeveloped,andasyntheticworkloadisgeneratedforperformancestudies.Inminingmobilesequentialpatterns,itisshownbyourexperimentalresultsthatalgorithmTJPFsignifi-cantlyoutperformsothersinbothexecutionefficiencyandmemorysaving,indicatingtheusefulnessofthepatternfamilytechniquede-visedinthispaper.Itisshownbyourresultsthatbytakingbothmovingandpurchasepatternsintoconsideration,onecanhaveabettermodelforamobilecommercesystemandisthusabletoex-ploittheintrinsicrelationshipbetweenthesetwoimportantfactorsfortheefficientminingofmobilesequentialpatterns.
IndexTerms—Datamining,mobilecomputing,mobilesequen-tialpatterns,userbehavior.
Fig.1.Illustrativeexampleforamobiletransactionsequencewherecellsareunderlinedifitemsarepurchasedthere.
I.INTRODUCTION
T
HEEMERGENCEofpowerfulportabledevices,alongwithadvanceinwirelesscommunicationtechnologies,hasmadethemobileservicesavailable.Inthenearfuture,itisexpectedthattensofmillionsofuserswillcarrymobilephonesorportabledevicesthatusewirelessconnectiontoac-cessaworldwideinformationnetworkforbusinessorpersonalusefromanywhereatanytime,makingthemobilecommerce(MC)areality[1],[57],[58].Forexample,eNetworkWebExpress[19]enablesmobileuserstousecommercialWebap-plicationsoverwide-areawirelessnetworks(WANs).Bluetoothtechnology[20]allowsterminalsandcashregisterstotalkdi-rectlytoeachotherforthepurposeofmobilecommerce.The
ManuscriptreceivedMay2,2003;revisedJuly9,2004.Theworkwassup-portedinpartbytheNationalScienceCouncilofTaiwan,R.O.C.,underContractNSC93-2752-E-002-006-PAE.
TheauthorsarewiththeDepartmentofElectricalEngineering,NationalTaiwanUniversity,Taipei,Taiwan,R.O.C.(e-mail:mschen@cc.ee.ntu.edu.tw).DigitalObjectIdentifier10.1109/TSMCC.2005.855504
WirelessAccessProtocol(WAP)[21]bringstheMCenviron-mentaworld-widestandardforprovidingInternetcommuni-cationstodigitalmobilephones.InanMCenvironment,cus-tomerscanmakeanytransactionfromanywhereatanytimewiththepaymentmechanismprovidedbybanksorcreditcardcompanies[58].Inaddition,somekindofNokiamobilephonesprovidethewalletapplicationthatenablescustomerstogeteasyaccesstomobileservicesandtomakeconvenientonlinemo-biletransactions[2].Inthewallet,customerscanstoresensitivepersonalinformation,suchaspaymentandloyaltycarddetails,deliveryaddresses,andnotes,aswellasserviceprofiles.Inaddi-tion,withthewalletapplication,theNokiamobilephoneshavethecapabilityofstoringthetransactionswithmovingpatternsandpurchasingpatternsofcustomers.
Example1.1:Oneexamplescenarioenvisionedforamo-biletransactionsequenceisshowninFig.1,whereacustomermovesinthemobilecommerceenvironmentandmakestrans-actionsinthecorrespondingcellthroughthemobiledevice.Fig.1(a)showsthemovingpatternsofthiscustomerandthemobiletransactionsequencedataisrecordedinFig.1(b),whereforexample,itemi1waspurchasedwhenthecustomermovedtothecellA.
ItisimportanttonotethatsincecustomersaremovingalonganMCenvironmenttosearchfordesireditemstopurchase,theimplicationsfrommovingpatternsandpurchasepatternsareinfactentangled,andbothareofgreatimportanceforstudyingcustomerbehaviors.Clearly,thedistinctivefeaturesofknowl-edgediscoveryinanMCenvironmentincreasethedifficultyof
1094-6977/$25.00©2007IEEE
YUNANDCHEN:MININGMOBILESEQUENTIALPATTERNSINAMOBILECOMMERCEENVIRONMENT279
extractinginformationfromthemobiletransactionsequences.However,asthesemobilecommerceservicesarebecomingin-creasinglypopularnowadays,itisimperativetodeviseefficientalgorithmsforderivingcustomerbuyingbehaviortoimprovethequalityoftheseservices.Asaresult,thedesignanddevel-opmentofefficientminingalgorithmsforknowledgediscoveryinanMCenvironmentwhilefullyexploringtheintrinsicrela-tionshipbetweenmovingandpurchasepatternsistakenastheobjectiveofthispaper.ConductingtheminingonthemovingandpurchasepatternsofcustomersinanMCenvironmentiscalledtheminingofmobilesequentialpatterns(i.e.,largese-quentialpatterns)inthispaper.Inaddition,anovelknowledge,calledmobilesequentialrules,canbederivedfromthemobilesequentialpatternsforthemeasurementofcustomerpurchasebehaviorassociation.
Example1.2:FortheexampleshowninFig.1,thecustomerhasonekindofmovingpatternABCandtwokindsofpurchas-ingpatterns{A;t1andIfthereC;t9are}wheresufficientitemsetcustomerst1={i1having}anditemsett9={i2,i3}.thesamepatterns,themobilesequentialpatternisanimplica-tionoftheform{A;t1,C;t9}:ABC,whichmeansthatmostcustomersusuallypurchaseitemsett1incellAandthenpurchaseitemsett9incellCwiththespecificpathABC.Inad-dition,themobilesequentialruleisanimplicationoftheform{A;t1=⇒C;t9:ABCwhichmeansthatcustomerspur-chasingitemsett1incellAareusuallymovingalongpathABCtocellCforpurchasingitemsett9.Withthemobilesequentialrule,whenacustomerpurchasesitemsett1incellA,thecellularphonecompanycouldsendthecouponsofproducts(i.e.,itemi2anditemi3)initemsett9toboostthesalesthroughthebasestationsinthecellsA,B,orCinaccordancewiththeirbroad-castingschedules.Moredescriptionaboutmobilecommerceisavailablein[1].
ThedetailsofrelatedworksaregiveninSectionII-A.De-spitesomeeffortshavingbeenelaborateduponexaminingtheuserbehavior,noneofthepriorwork,tothebestofourknowl-edge,hastakenbothmovingandpurchasepatternstogetherintoconsiderationtomodelthecustomerbehaviorinamobilecom-merceenvironment.Thiscaninpartbeexplainedbythefactthatthecostisexpensivetotrackandlogdetailedmovementsofmobileuserstoday.1However,itisexpectedthatsuchcostwilldecreasesoonandthecellularphonewillbecomethepopularinterfaceoftheinterconnectionnetworksforaccessingvariousservices[57],thusjustifyingthepracticalityandnecessityofconductingmobilesequentialpatternmining.Itisunderstoodthattherecordsofcellsvisitedanditemspurchased,requiredforminingmobilesequentialpatterns,maybelongtodifferentcompanies,andforthesecompanies,theymayhavedifferentconsiderationsonusingtheirdatatoimprovethemobilecom-merceservicesprovided.Itshouldgowithoutsayingthatsuchdataanalysisshouldbedonesolelyforthepurposesofsystemandserviceimprovementsandshouldbeconductedinacontin-gentwaythatneitheranylawisviolatednoristheprivacyofcustomersintruded.Nevertheless,withthelegalityandprivacy
1The
costtolocateamobileuserisestimatedtobeaboutUS$0.01to
US$0.05eachtimeaccordingtoamajormobilephoneserviceprovider.
Fig.2.Notionofminingmobilesequentialpatterns.
issuesconsidered,theknowledgediscoveryfromtheMCdataisbelievedtobeanincreasinglychallengingtechnicalprob-lemwhichisofgreatpracticalimportancefortheevolvingMCtechniques.
Consequently,tobetterreflectthecustomerbuyingbehav-iorintheMCenvironment,weproposeaninnovativeminingmodelthattakesboththemovingpatternsandpurchasepatternsintoconsideration.Inessence,theminingofmobilesequentialpatternsaggregatestheconceptsonminingassociationrules,miningpathtraversalpatterns,andminingsequentialpatterns,andthusrequiresacombineduseofcorrespondingtechniques.ThenotionofminingmobilesequentialpatternsisshowninFig.2,wheretherelationshipamongtheseminingcapabili-tiesisdepicted.Howtostrikeacompromiseamongtheuseofvariousknowledgetosolvetheminingonmobilesequentialpatternsisachallengingissue.Asanefforttosolvethisprob-lem,wedeviseaprocedure,namelymobilesequentialpatternsMSPs),toconducttheminingofmobilesequentialpatterns.WiththedetailsdescribedintheSectionII-C,theprocedureMSPsplitstheproblemofminingmobilesequentialpatternsintofourphases,namely:1)thelarge-transactiongenerationphase;2)thelarge-transactiontransformationphase;3)thesequential-patterngenerationphase;and4)thesequential-rulegenerationphase.
Inthispaper,theperformancebottleneckisinphase3),i.e.,thesequential-patterngenerationphase.Byhavingdifferentprioritiesonthefactorsinvolvinglargeitemsets,traversalpathsandordersofpurchases,wedevisethreealgorithms(algorithmTJLS,algorithmTJPT,andalgorithmTJPF)todeterminemobilesequentialpatterns.First,algorithmTJLSisdevisedinlightoftheconceptofitemsetjoininginassociationrulesmining[6].However,aswillbeseenlater,withoutfullyutilizingthetraversalpathsofmobilesequentialpatterns,algo-rithmTJLStendstocountthesupportsofalotofout-of-pathsequentialpatterns(i.e.,thesequentialpatternswhichdonotstaywithinapath),thusdegradingtheperformance.Next,toeliminatetheout-of-pathsequentialpatterns,algorithmTJPTisdevisedbytakingboththeconceptsofassociation
280IEEETRANSACTIONSONSYSTEMS,MAN,ANDCYBERNETICS—PARTC:APPLICATIONSANDREVIEWS,VOL.37,NO.2,MARCH2007
rules[6]andpathtraversalpatterns[12]intoconsiderationandgainsperformanceimprovementbypathtrimming.However,algorithmTJPTincursthecomparisonalongthepathwithtimecomplexityO(P),wherePistheaveragepathlength,andmaygeneratesomeuncertaincandidatesequentialpatternswhichundesirablyrequirefurthersubpatternidentifications.Consequently,byfullyexploringtheintrinsicrelationshipbe-tweenmovingandpurchasebehaviorsofcustomers,algorithmTJPFisdevelopedinlightofthepatternfamilytechnique.Thepatternfamilyofapatternconsistsofthepatternitselfandallitssubpatternsgeneratedineachround.Inessence,thepatternfamilytechniqueisafilteringtechniquethatfullyexploitsthepurchasepatternsgeneratedintheintermediatestagesofmo-bilesequentialpatternmining.Forbetterreadability,wedeferthedetaileddescriptionofthepatternfamilytechniqueandthecorrespondingtheoreticalpropertiestoSectionIII.Itwillbeshownthatutilizingtheinformationofpatternfamily,algo-rithmTJPFcomparesthepathwithtimecomplexityO(1)andgeneratesfeweruncertaincandidatesequentialpatterns,thusre-ducingthecorrespondingcomputationaloverheadandmemoryconsumption.
Afterallmobilesequentialpatternsareobtained,themobilesequentialrulescanbederivedwithastraightforwardwayanditisdescribedclearlyinSectionII-C4.AsimulationmodelfortheMCenvironmentisdevelopedandasyntheticworkloadisgeneratedforperformancestudies.Byutilizingpatternfamilytechnique,TJPFisshowntobeabletodeterminelargese-quentialpatternsveryefficiently.Asvalidatedbythesyntheticworkload,itisshownbyourexperimentalresultsthatalgorithmTJPFsignificantlyoutperformsothersinboththeexecutionef-ficiencyandthememorysaving.Itisshownbyourresultsthatbytakingbothmovingpatternsandpurchasepatternsintocon-sideration,onecanhaveabettermodelforanMCsystemandisthusabletoexploittheintrinsicrelationshipbetweenthesetwocustomerbehaviors.
Thispaperisorganizedasfollows.PreliminariesaregiveninSectionII.InSectionIII,threealgorithms(TJLS,TJPT,andTJPF)aredevisedfordetermininglargesequentialpatterns.ExperimentalstudiesareconductedinSectionIV.ThispaperconcludeswithSectionV.
II.PRELIMINARIES
Inthissection,theproblemofminingmobilesequentialpat-ternsisdescribedinSectionII-A,therelatedworksaredescribedinSectionII-B,andtheprocedureofminingmobilesequentialrulesisoutlinedinSectionII-C.A.ProblemFormulation
Inthemobilecommerceenvironmentwhereitemsaresoldinvariouscells,customersmaymoveamongthecellstopur-chaseitemsofinterestwitheithertraditionalorelectroniccom-mercetradingmechanisms.Ineithercase,customerspayforitemsthroughthemobiledevicesandthepurchasingrecordsarelogged.LetN={n1,n2,...,ng}bebeasetasetofofcellsitemsinthesoldMCenvironmentandI={i1,i2,...,ih}inthatenvironment.Wearegivenadatabaseofmobiletransaction
TABLEI
PATTERNSANDTHEIRNOTATIONS
sequences,whereeachmobiletransactionsequenceconsistsofsequence-id,cellsvisited,andalistofitemsetspurchasedinthecorrespondingcells,orderedbycustomermovementsamongcells.
Apathisdenotedbyn1n2...ny,wherenj∈N,for1≤j≤y.Thus,thesequenceofcellsvisitedimplicitlyformsthepathofthemobiletransactionsequence.Inthispa-per,thediscoveredpatternsandtheirnotationaregiveninTableI.Atransaction,denotedasC;{i1,i2,...,ip},meansthatitemset{i1,i2,...,ip}wasboughtincellC,whereC∈N,and{i1,i2,...,ip}⊆I.Thus,thelistofitemsetspurchasedinthecorrespondingcellsimplicitlyformsthelistoftransactionsofthemobiletransactionsequence.Asare-sult,eachmobiletransactionsequencecontainstheinforma-tionofpathandalistoftransactions.GivenadatabaseDMofmobiletransactionsequences,theproblemofminingmo-bilesequentialpatternsistodiscoverthefrequentsequentialpatternssamongallmobiletransactionsequences.Asequentialpatternisrepresentedbytheformlistoftransactions:path,wherethetransactionsaremadealongthepath.Thesupportforasequentialpatternisdefinedasthenumberofmobiletrans-actionsequenceswhichsupportthissequentialpattern.Alargesequentialpatternisasequentialpatternwiththeminimumsupport(i.e.,asequentialpatternthatappearedinasufficientnumberofmobiletransactionsequences).
Thelengthofalargesequentialpatternisthenumberoftrans-actionsinthatlargesequentialpattern.Alargesequentialpatternoflengthkiscalledalargek-sequentialpattern.Thus,alarge1-sequentialpatterncanberepresentedbytheformtransaction:cell,wherethetransactionismadeinthecell.Notethateachtransactioninalargek-sequentialpatternmustmeetthemini-mumsupport.In[7],anitemsetwithminimumsupportiscalledalargeitemsetorlitemset.Similarly,wecallatransactionwithminimumsupportlargetransactionorL-transaction,whichcanberepresentedasC;tj,wheretjrepresentsalitemsetincellC.Thus,ifthetransactionC;{i1,i2,...,ip}hasthemini-mumsupport,thelitemset{i1,i2,...,ip}willberepresentedastjandtheL-transactionC;{i1,i2,...,ip}willberepre-sentedasC;tj.Sinceeachtransactioninalargek-sequentialpatternwillhavetheminimumsupport,alargek-sequentialpatterncanberepresentedas{x1,x2,...,xk}:n1n2...ny,wherexjisanL-transactionmadealongthepathn1n2...ny.Recallthatinassociationrules[6],alargeitemsetisafrequentlypurchaseditemset.Insequentialpatterns[7],alarge
YUNANDCHEN:MININGMOBILESEQUENTIALPATTERNSINAMOBILECOMMERCEENVIRONMENT281
sequenceisafrequentlypurchasedsetofitemsetsorderedbythepurchasetime.Intraversalpatterns[12],alargereferenceisafrequentlytraveledpath.Inthispaper,alargesequentialpatternisapatterncontaining:1)thefrequentlypurchaseditemset(meaningthattheitemsetinanL-transactionmusthavetheminimumsupport);2)thefrequentlypurchasedsetofitemsetsorderedbythepurchasetime(meaningthatthesetofL-transactionsinalargesequentialpatternmusthavetheminimumsupport);and3)thefrequentlytraveledpath(meaningthatthepathinalargesequentialpatternmusthavetheminimumsupport).B.RelatedWork
Recently,miningofdatabaseshasattractedagrowingamountofattentionindatabasecommunitiesduetoitswideapplica-bilitytostudyingthebuyingbehaviorsofcustomers[11],[18].Miningassociationrulesisemployedtodiscovertheimpor-tantassociationsamongitemssuchthatthepresenceofsomeitemsinatransactionwillimplythepresenceofotheritemsinthesametransaction[5].Afterthat,severaltechnologiesonassociationrulemininghavebeendevelopedincluding:1)al-gorithmimprovements[3],[6],[10],[15],[27],[32],[43],[68];2)constraint-based[25],[28],[46];3)incrementalupdating[9],[14],[30];4)multipleminimumsupports[35],[60];5)fre-quentcloseditemsets[45],[47],[67];and6)generalized[53],multilevel[23],intertransaction[56],quantitative[54],andmul-tidimensional[61],[62].
Miningsequentialpatternswasfirstintroducedin[7]forfindingtheintertransactionpatternsinthetraditionalretailingenvironments.Afterthat,severaltechnologiesonsequentialpat-ternmininghavebeendeveloped,including:1)algorithmim-provements[26],[39],[48],[66];2)constraint-based[36],[55],[65];3)incrementalupdating[33],[44],[69];and4)general-ized[55].
Severaltemporalassociationruleminingtechniquesaread-dressedin[8],[13],[29],and[41].Episodemininghasbeenstudiedin[31]and[38]fordiscoveringfrequentpatternsinasequenceoftimeevents.Dasetal.[17]investigatedtheprob-lemoffindingrulesrelatingpatternsinatimeseriestootherpatternsinthatseries,orpatternsinoneseriestopatternsinanotherseries.Miningseriesofintervaleventswasdiscussedin[59]fordiscoveringthetemporalcontainmentrelationshipsofeventsequences.Atemporallogicapproachisproposedin[42]forfindingtemporalpatterns.Miningpartialordersfromthesequentialdataisexploredin[37].Miningsegmentwiseperi-odicpatternsisdiscussedin[24].Miningasynchronousperiodicpatternsisinvestigatedwithinasubsequenceshiftedbydistur-bance[63].Searchingforpartialperiodicpatternsintime-seriesdatabasesisdiscussedin[22].
AstudyonefficientminingofpathtraversalpatternsforcapturingWebuserbehaviorwasconductedin[12].WEB-MINER[16]wasdesignedforminingWebusageassociationrulesandsequentialpatterns.SeveralWWWserverlogsareanalyzedin[50]forderivingthepathdistributionpatternsoftheWebusers.Withmininglongestrepeatedsubsequences,arobustmethodwasproposedin[51]forreducingthecomplexity
Fig.3.
Flowchartofthewholeprocedureofminingmobilesequentialpatterns.
whilepreservingthepredictabilityoftheWebusersurfingpaths.Recently,severalECWebsiteshavebeenusingrecommendedsystemsforanalyzingthecustomerbehaviorstohelptheircus-tomerstofindproductsforpossiblepurchases[4],[52].ForcapturingtheuserbehaviorintheECenvironment,astudyonefficientminingWebtransactionpatternswasreportedin[64].Notethatbytreatingthecellsasotheritemsinthepatterns,onemayextendalgorithmGSP,whichwasdesignedforminingconventionalsequentialpatternsin[55],tofindmobilesequen-tialpatterns.However,suchanextensiontoalgorithmGSPisnotdeemedidealforminingmobilesequentialpatternsfortworeasons.First,sinceourapproachesprocessthecellsandtheitemsindividually,andthemodifiedGSPtreatsthecellasan-otheriteminthepatterns,itisexpectedthattheformerwillhaveasmallerdomaintoprocess,therebyhavingbetterefficiency,thanthelatter.Moreimportantly,bysimplytreatingthecellasanotherattribute,GSPisnotabletoutilizetheintrinsicrela-tionshipbetweenmovingandpurchasebehaviorsofcustomers,thusnotattainingtheminingefficiencywecouldhaveowingtothenatureofthisproblem.
C.ProcedureforMiningMobileSequentialPatternsWiththeaggregateconceptofminingonassociationrules,pathtraversalpatternsandsequentialpatterns,theproblemofminingmobilesequentialpatternscannotbesolvedbyasimpleadditionofpriortechniquessincefactorsinthesecompanionminingcapabilitiesareinfactentangled.Thisfactjustifiesthenecessityofdevisinganewminingprocedureformobilese-quentialpatterns.Asthemobilecommercebusinesshasbeenidentifiedbyseveralleadingindustrialcompaniesasthekeydirectiontomoveforyearstocome,itisbelievedthatmin-ingmobilesequentialpatternshasbecomeaverytimelyandimportantissuetoaddress.
TheflowchartforthewholeprocedureisshowninFig.3andthemeaningsofsymbolsaregiveninFig.4.Intheoverallprocedure,theproposedmethodsforminingmobilesequentialpatternsisoutlinedasfollows.
282IEEETRANSACTIONSONSYSTEMS,MAN,ANDCYBERNETICS—PARTC:APPLICATIONSANDREVIEWS,VOL.37,NO.2,MARCH2007
Fig.4.Meaningsofsymbolusedinminingmobilesequentialpatterns.
ProcedureMSP(MobileSequentialPatterns):
1)Large-TransactionGenerationPhase:Determinethe(L-transactionslargetransactions)fromthemobiletrans-actionsequences.
2)Large-TransactionTransformationPhase:EmployalgorithmLarge-TransactionTransformationwithSequence-Trimming(LTTST)totransformallmobiletransactionsequencesintothemaximalL-transactionsequences.
3)Sequential-PatternGenerationPhase:Employoneofthefollowingthreealgorithms[TJLS(TransactionsetJoinwithLarge-transactionset),TJPT(TransactionsetJoinwithPathTrimming),andTJPF(TransactionsetJoinwithPatternFamily)]todeterminethelargesequentialpatternsfromthemaximalL-transactionsequences.
4)Sequential-RuleGenerationPhase:Derivemobilese-quentialrulesfromthelargesequentialpatterns.
1)Large-TransactionGenerationPhase:Foreachcell,weapplyamodifiedalgorithmDHP[43]forfindingthesetofallL-transactionsTL.Similarlytotheapproachtakenby[7],thesetoflitemsetsismappedtoasetofcontiguousintegersforreducingthetimerequiredtocheckifamobilesequentialpatterniscontainedinamobiletransactionsequence.Notethatweareabletosimultaneouslydiscoverthesetofalllarge1-sequentialpatterns,sincethissetismainly{x:C|x∈TL,Cisthecellcontainingitemsetx}.
Example2.1:AnillustrativedatabaseforthisproblemisshowninFig.5,whereSequenceIDentification(SID)100isthemobiletransactionsequenceshowninFig.1.FortheexampledatabaseinFig.5,theL-transactionsareshowninFig.6(a).FortheL-transactionsshowninFig.6(a),afterthemappingshowninFig.6(b),thesetoflarge1-sequentialpatternsisshowninFig.6(c).
2)Large-TransactionTransformation(LTT)Phase:AswillbeseeninSectionIII,weneedtorepeatedlydeterminewhichpartofagivensetoflargesequentialpatternswillappearinthemobilesequentialpatterns.Forefficientlyminingthepatterns,weemployalgorithmLTTSTtotransformeachmobiletrans-
Fig.5.IllustrativeexampledatabaseDMthatstoressixmobiletransactionsequences.
Fig.6.Mappingtableshownin(b)mapsthelargetransactionsin(a)tothelarge1-sequentialpatternsin(c).
actionsequenceintoamaximalL-transactionsequenceinthisphase.
Example2.2:WiththemobiletransactionsequenceshowninFig.1andthemappingtableshowninFig.6(b),Fig.7illustratestheoperationsinalgorithmLTTST.InFig.7,thefirstcolumncorrespondstothesequenceofmovements,thesecondcolumncontainsthenodesvisitedandthethirdcolumnhastheitemspurchasedinSID100.Thefourthcolumngivestheon-goingL-transactioninthebufferandthefifthcolumngivestheon-goingstringinthebuffer.ThesixthcolumnshowstheL-transactionsetandtheseventhcolumnshowsthepathofthemaximalL-transactionsequencegeneratedbyLTTST.
Notethatthesameitemsetsindifferentcellsareviewedasdifferenttransactions.Thus,thesamelitemsetssoldindifferentcellswillbetransformedtodifferentintegers.
Example2.3:Forexample,transactionsA;{iB;{ig,ih},g,ih},andC;{ig,ih}allhaveitemset{ig,ih}.Inad-dition,{ig,ih}isalitemsetinbothcellsAandBbutisnotincellC.Afterthisphase,{ig,ih}incellAand{ig,ih}incellBaretransformedtodifferentintegers(say,tyandtz)whereas{ig,ih}incellCwillbetrimmed.ForthedatabaseDMshown
YUNANDCHEN:MININGMOBILESEQUENTIALPATTERNSINAMOBILECOMMERCEENVIRONMENT283
Fig.7.Exampleforproducingthemaximallargetransactionsequences.
inFig.5,thetransformeddatabaseDT,storingmaximalL-transactionsequences,isshowninthefirsttableofFig.8forillustrativepurposes.
3)Sequential-PatternGenerationPhase:AfterallthemobiletransactionsequencesaretransformedtomaximalL-transactionsequences,threealgorithms(algorithmTJLS,algorithmTJPT,andalgorithmTJPF)aredevisedformininglargesequentialpatternsfromthetransformeddatabaseDT.Alargek-sequentialpatternisrepresentedas{x1,x2,...,xk}:n1n2...ny,wherexjisanL-transactionmadealongthepath{n1n2...ny}.ThedetailsofalgorithmsofthisphasewillbedescribedinSectionIII.
Example2.4:Thelargesequentialpatterns,generatedinthesequential-patterngenerationphasefromtheexampledatabaseDT,areshowninFig.8.Forexample,{A;t1,C;t3,F;t4}:ABCDEFisonelarge3-sequentialpattern,whoseL-transactionsetandpathappearinSID100,SID200,SID400,andSID500.Thesupportisthus4.
4)Sequential-RuleGenerationPhase:Afterthesequential-patterngenerationphase,wecanfindthemobilesequentialrulesfromthelargesequentialpatternsinthisphaseinastraightfor-wardmanner.Unliketheassociationrule[6],themobilese-quentialrule,derivedfrommobilesequentialpatternsinthispaper,isanimplicationoftheformX=⇒Z:n1,n2,...,ny,whereXandZarebothsetsofL-transactions,X∩Z=Φ,and{n1,n2,...,ny}⊆N.TheruleX=⇒Z:n1n2...nyhassupportsifthenumberofmobiletransactionsequencesinDMcontainingX∪Z:n1n2...nyiss.Also,theruleX=⇒Z:n1,n2,...,nyholdswithconfidencecifc%ofmobiletransactionsequencesinDMthatcontainXalsocon-
tainZalongthepath{n1,n2,...,ny}.Explicitly,support(X=⇒Z:n1n2...ny)=support(X∪Z:n1n2...ny),andconfidence(X=⇒Z:n1,n2,...,ny)=(X∪Z:n1n2...nh...ny)/(X:n1n2...nh}).
Example2.5:Forexample,supposethat{A;t1,C;t3,F;t4}:ABCDEFisonelarge3-sequentialpat-ternwithsupport=4and{A;t1,C;t3}:ABCisonelarge2-sequentialpatternwithsupport=5.Then,wecanderiveonemobilesequentialrule{A;t1,C;t3}=⇒{F;t4}:ABCDEFwiththesupportequaltosupport({A;t1,C;t3}=⇒{F;t4}:ABCDEF)=
C;t3,F;t4}:ABCDEF)support({A;t1,
=4andtheconfidence({A;t1,C;t3}=⇒
ABCDEF)=(support({A;t1,C;t3,{F;t4}:
F;t4}:ABCDEF))/(support({A;t1,C;t3}:ABC)}=80%.
III.ALGORITHMSFORMININGMOBILE
SEQUENTIALPATTERNS
OncethedatabasecontainsmaximalL-transactionsequencesforallmobileusers,wecanderivethelargesequentialpatternsbyidentifyingthefrequentlyoccurringtransactionsequences.LetSkbethesetoflargek-sequentialpatterns,Rkbethesetofcandidatek-L-transactionsets,andCkrepresentthesetofcan-didatek-sequentialpatterns.RkisthetransactioncomponentofCk,andSkisasubsetofCk.Byhavingdifferentprioritiesonthefactorsinvolvinglargeitemsets,traversalpathsandordersofpurchases,wedevisethreealgorithms(algorithmTJLS,algo-rithmTJPT,andalgorithmTJPF)todeterminelargesequential
284IEEETRANSACTIONSONSYSTEMS,MAN,ANDCYBERNETICS—PARTC:APPLICATIONSANDREVIEWS,VOL.37,NO.2,MARCH2007
Fig.8.Largesequentialpatternsgeneratedinsequential-patterngenerationphasefromtheexampledatabaseDT.
patterns.BecausebothalgorithmTJPTandTJPFgenerateSkalongwiththegenerationofCk+1,weuseroundktorefertotheprocedureperformedtoobtain(Sk,Ck+1).ForalgorithmTJLS,weuseroundktorefertotheprocedureperformedtoobtain(Sk,Rk+1).NotethatS1isobtainedinthelarge-transactiongenerationphase,wethususeroundonetorefertotheprocedureperformedtoobtain(R2).Thesealgorithmsaredevisedstepbystepinlightofthefeaturesofthecandidategenerationofsequentialpatternsandareoutlinedasfollows.GeneralizedDescriptionsofAlgorithms:
1)AlgorithmTJLS:Byderivingastraightforwardextensionfrompriorworks,algorithmTJLSisdevisedasavariantofalgorithmaprioriin[6]byusingatwo-levelhashtreeinmininglargesequentialpatterns.
YUNANDCHEN:MININGMOBILESEQUENTIALPATTERNSINAMOBILECOMMERCEENVIRONMENT285
2)AlgorithmTJPT:Inlightoftheconceptofthepathtrimmingtechnique,algorithmTJPTisdevisedbytakingthepathintoconsiderationingeneratingthecandidatepatterns.
3)AlgorithmTJPF:Inlightoftheconceptofthepatternfamilytechnique,algorithmTJPFisdevisedbyusingtheshared-pathtreeingeneratingthecandidatepatterns.A.AlgorithmTJLS(TransactionsetJoinWithLarge-
TransactionSet)
AlgorithmTJLSisavariantofalgorithmaprioriin[6].Algo-rithmTJLSessentiallyutilizestheconceptofjoiningitemsetsinassociationrulemining[6],[55]whilesolvingthediscrepancybetweenlargesequentialpatternsandlargeitemsets.Similarlytoalgorithmapriori[6],TJLSjoinstheL-transactionsetsoflarge(k−1)-sequentialpatternsforthegenerationofcandidatek-L-transactionsetsintheproceduretodiscoverlargesequentialpatterns.However,unlikealgorithmapriori,TJLSemploysatwo-levelhashtree,calledthemobilesequencetree,tostorethecandidatesequentialpatterns.Byutilizingthetwo-levelhashingtechnique,TJLScanjointheL-transactionsetstoconstructthetransactioncomponentofthemobilesequencetreeinthecan-didategeneration.Then,inthedatabasescanforcountingthesupport,TJLSconstructsthepathcomponentbyextractingthecorrespondingpathfromthemaximalL-transactionsequenceswhoseL-transactionsetscontainthecorrespondingcandidateL-transactionsets.
Inthetwo-levelhashtree,anodeeithercontainsalistofpatterns(aleafnode)orahashtable(aninternalnode).Inaninternalnode,eachbucketofthehashtablepointstoanothernode.Thepatternsarestoredintheleafnodes.Therootofthehashtreeisdefinedtobeatdepth1.Aninternalnodeatdepthdpointstonodesatdepthd+1.WhenTJLSaddsapatternp,TJLSstartsfromtherootandgodownthetreeuntilreachingaleaf.Ataninternalnodeatdepthdinthetransactioncomponent,TJLSdecideswhichbranchtofollowbyapplyingahashfunctiontothedthL-transactionoftheL-transactionsetofpatternp.Similarly,ataninternalnodeatdepthginthepathcomponent,TJLSdecideswhichbranchtofollowbyapplyingahashfunctiontothegthcellofthepathofpatternp.
InthebeginningofhashingamaximalL-transactionsequencem,TJLSfindsallthecandidatesequentialpatternscontainedinmasfollows.IfTJLSreachesaninternalnodebyhashingtheL-transactionl(cellc),ithashesoneachL-transaction(cell)thatcomesafterl(c)inmandrecursivelyappliesthisproceduretothenodeinthecorrespondingbucket.IfTJLSreachesaleafnode,itfindswhichofthepatternsintheleafnodearecontainedinmandaddssupportcountstothem.
Example3.1:Fig.8isthelargesequentialpatternsgener-atedinsequential-patterngenerationphasefromtheexampledatabaseDT,andFig.9isthemobilesequencetreestoringS4inFig.8.
Forminingmobilesequentialpatterns,thefirstroundisex-ecutedwithlarge-transactiongenerationphasetoobtainS1,thesetoflarge1-sequentialpatterns,asshowninFig.6(c).In
addition,algorithmTJLSutilizestheL-transactionsinS1astheseedsetforgeneratingR2,thesetofcandidate2-L-transactionsets,whichisstoredinthetransactioncomponentofamo-bilesequencetree.Inthesecondround,TJLSconstructsthecompletemobilesequencetreebyhashingeachcombinationof2-L-transactionssetineachmaximalL-transactionsequenceintothetransactioncomponentandhashingthecorrespondingpathforconstructingthepathcomponent,tocountthesupportofeachcandidatesequentialpattern.Then,TJLSdestructsthemo-bilesequencetreeforderivingS2,thesetoflarge2-sequentialpatterns,andutilizestheL-transactionsetsinS2forgenerat-ingR3.Ineachsubsequentround,TJLSstartswithcandidateL-transactionsetsfoundinthepreviousroundforthecountingofsupportsofcandidatesequentialpatternsandthenidentifieslargesequentialpatterns.TJLSproceedstothegenerationofnewcandidateL-transactionsetsandstoresthemtothemobilesequencetree.Theprocedurecontinuesuntilnolargesequentialpatternsarederived.
Example3.2:ToillustratetheoperationsofalgorithmTJ,F;tLS,itcanbeseenfromFig.9,{A;t1,C;t34,G;t5}isacandidate4-L-transactionsetgeneratedbyjoin-ingtheL-transactionsets{A;t{A;t,C;t1,C;t3,F;t4}and13,G;t5},respectively,fromlarge3-sequentialpatterns{A;t{A;tC;t1,C;t3,F;t4}:ABCDEFand1,3,G;t5}:ABCDEFG.Inscanningdatabasephase,algorithmTJLSconstructsthepathcomponentofthemobilesequentialtreeandcountssupportsofthecan-didatesequentialpatterns.Forexample,afterthetransac-tioncomponentoftreeisconstructedintheFig.9(a),TJLSscansthedatabaseDTinFig.8toobtainthepathcompo-nentinFig.9(b)whilealsocountingsupports.InSID100,whenthesupportforcandidateL-transactionset{A;tC;ttcounted,thecorresponding1,3,F;4,G;t5}isbeingpathABCDEFGwillbegeneratedinthepathcomponentofthemobilesequencetreetoaccountforonesupportcountof{A;t1,C;t3,F;t4,G;t5,}F;:ABCDEFGt.Hence,thefi-nalsupportof{A;t1,C;t34,G;t5}:ABCDEFGis2,i.e.,fromSID100andSID200.Explicitly,thecor-respondingpathisdividedintoseveralsubpathsbyidenti-fyingthecellsofL-transactions.FortheexampleshowninFig.10,algorithmTJ{A;t,P;tLScountsthesupportofL-transactionset1,C;t37}inSID100.TJLSfirstlocatesA;t1onposition(1)andC;t3onposition(2)intheL-transactionset,andthecorrespondingsubpathABCisextractedfromthesubpathABCshownin(3).Then,TJLSlocatesC;t3onposition(2)andP;t7onposition(4)intheL-transactionset,andthecorrespondingsubpathCDEFGLPisextractedfromthesubpathCDEFGHQGLPshownin(5).NotethattheredundancyofHQGiseliminatedbecausetheycauseacyclebetweenL-transactionC;t3andL-transactionP;t7.Afterscanningdatabaseforcountingthesupportofcandi-datesequentialpatterns,TJLSobtainslargesequentialpat-ternsintheprocedureofdestructingthemobilesequencetree.Eachlargesequentialpatternisgeneratedwhenitssup-portexceedstheminimumsupport.Forexample,onecande-structthemobilesequencetreeinFig.9todetermineS4inFig.8.
286IEEETRANSACTIONSONSYSTEMS,MAN,ANDCYBERNETICS—PARTC:APPLICATIONSANDREVIEWS,VOL.37,NO.2,MARCH2007
Fig.9.Datastructureofamobilesequentialtreeforstoringcandidate4-sequentialpatternsinalgorithmTJLS.
Fig.10.ProcedureofcountingsupportofL-transaction{A;t1,C;t3,P;t7}:ABCDEFGLPinSID100.
set
B.AlgorithmTJPT(TransactionsetJoinWithPathTrimming)Withoutexploitingthepathsoflargesequentialpat-terns,algorithmTJLStendstocountthesupportsofalotofout-of-pathsequentialpatterns(i.e.,thesequentialpatternsthatdonotstaywithinthepath),thusdegradingtheperformance.Inlightoftheconceptofpathtrimming,algorithmTJPTisdesignedbytakingboththeL-transactionsetsandpathsoflargesequentialpatternsintoconsiderationtogeneratecandidatesequentialpatterns.Explicitly,duringthegenerationoflargesequentialpatterns,bydestructingthemo-bilesequencetree,TJPTnotonlydetermineslargesequentialpatternsbutalsomaintainsabufferthatcontainstheleafnodesinthetransactioncomponentandthecorrespondingpathsinthepathcomponentsoastoclassifythepatterns.Thepurposeofclassifyingthepatternsisthatthepatterns,whosepathsdonot
containeachother,neednotbeconsideredtogeneratecandidatesequentialpatternstogether.Thus,TJPTcantrimthegenerationofcandidatesequentialpatternsaccordingtothepaths.Thisisreferredtoasthepathtrimmingtechnique.Asaresult,TJPTuti-lizeslargesequentialpatternstogeneratecandidatesequentialpatternsinthecandidategenerationforsolvingtheout-of-pathsequentialpatternprobleminTJLSmentionedabove.
Example3.3:FortheexampleshowninFig.8,inSID300,whenthesupportforcandidate4-L-transactionset{A;t1,C;t3,F;t4,G;t5}isbeingcounted,thecorrespondingpathAWBCEFGwillbegeneratedinthepathcomponentofmobilesequencetreetoaccountforonesupportcountfor{A;t1,C;t3,F;t4,G;t5}:AWBCEFG.Notethat{A;t1,C;t3,F;t4,G;t5}:AWBCEFGhasfoursubpatternsincluding{A;t1,C;t3,F;t4}:AWBCEF,{A;t1,C;t3,G;t5}:AWBCEFG,{A;t1,F;t4,G;t5}:AWBCEFG,and{C;t3,F;t4,G;t5}:CEFG.However,allofthemarenotlarge3-sequentialpatterns.Instead,theyareout-of-path3-sequentialpatternsinround3inthesensethatnotalloftheirsubpatternsarelarge2-sequentialpatterns.Explicitly,only{F;t4,G;t5}:FGisalarge2-sequentialpatterninthiscase.However,algorithmTJLSstillcountsthesupportsoftheminround3.InalgorithmTJLS,out-of-pathsequentialpatternswillbegeneratedineachroundifthecandidateL-transactionsetsarecontainedintheL-transactionsetsofmaximalL-transactionsequencesinDT.Forexample,theout-of-pathsequentialpatternsgeneratedby
YUNANDCHEN:MININGMOBILESEQUENTIALPATTERNSINAMOBILECOMMERCEENVIRONMENT287
C.AlgorithmTJPF(TransactionsetJoinWithPatternFamily)AlgorithmTJPFissimilartoalgorithmTJPTinthatitem-ploystheconceptofutilizinglargesequentialpatternsforgener-atingcandidatesequentialpatternstoreducethecomputationaloverheadcausedbyout-of-pathsequentialpatternsbutisdif-ferentfromthelatterinthatalgorithmTJPFbyutilizingtheinformationinpatternsandisabletoreducethenumberofuncertaincandidatesequentialpatternsandstorecandidatese-quentialpatternswithacompactapproach,thusfurtherreducingthecorrespondingoverhead.RecallthatalgorithmTJPTutilizespathtrimmingtechniqueforthegenerationofcandidatesequen-tialpatternsbycomparingthepathsofitssubpatternstoidentifyifonepathcontainsanother.Example3.6:FortheexampleinFig.12,algorithmTJPTgeneratesthecandidate5-sequentialpatterninFig.12(a)byjoiningL-transactionset{A;t1,C;t3,F;t4,G;t5}insubpattern1andL-transactionset{A;t1,C;t3,F;t4,Q;t6}insubpat-tern2withthepathtrimmingtechniquetoidentifythatpathABCDEFGHQcontainspathABCDEFG.
ComparingthepathsofsubpatternsincursO(|P|)computa-tion,where|P|istheaveragepathlengthoflargesequentialpatterns.Inaddition,algorithmTJPTisrequiredtostorethesamepathsasthebranchesindifferentsubtreesofthetrans-actioncomponentinthemobilesequencetree,whichincursanexcessiveuseofmemory.Notethatevenbytreatingthecellsasotheritemsinthepatterns,modifiedalgorithmGSPstillneedstocomparethewholecellsinthepathandincursO(|P|)computation.Hence,algorithmTJPFsurpassesalgorithmTJPTandthemodifiedGSPinthatwiththepatternfamilytechnique,TJPFisabletogenerateamorecompacttreetostorethepatternstominimizethecorrespondingoverhead.1)RemarksofAlgorithmTJPF:AlgorithmTJPFisdevisedinlightofthepatternfamilytechnique.Tofacilitateourdescrip-tionofalgorithmTJPF,sometheoreticalpropertiesofpatternfamilyaredevisedbelow.
Definition1:Amaximalsequentialpatternisalargesequen-tialpatternthatisnotcontainedinanyotherlargesequentialpattern.Foreachmaximalsequentialpattern,itspatternfamilyconsistsofthepatternitselfandallitssubpatternsgeneratedineachround.
Example3.7:FortheexampleshowninFig.8,oneofthemaximalsequentialpatternsis{A;t1,C;t3,F;t4,G;t5,Q;t6}:ABCDEFGHQwhichisalsoalarge5-sequentialpattern.ThecorrespondingpatternfamilyisshowninFig.13.
Definition2:ForapatternfamilywhosemaximalsequentialpatternisskwhichconsistsofL-transactions{x1,x2,...,xk}andpathm1m2...mq,amaximal-pathlarge2-sequentialpattern(abbreviatedlyasMS2),{x1,xk}:m1m2...mq,isalarge2-sequentialpatternwhichhasthesamepathasthemaximalsequentialpatternofthispatternfamily.
Example3.8:Foreachpatternfamily,itisnotedthatMS2isthelarge2-sequentialsubpatternwiththemaximalpath.FortheexampleshowninFig.13,patternsmarkedgrayarethepatternswithMS2={A;t1,Q;t6}:ABCDEFGHQ,whichisthe
Fig.11.Examplefordescribingtheout-of-pathsequentialpatternprobleminSID300causedbyalgorithmTJLS.
SID300areshowninFig.11.Suchanout-of-pathsequentialpatternproblemwillhappeninroundkfork>2.Thisinturnimpliesthatonecantrimthesupportcountingoftheredundantsequentialpatternsaccordingtothepathstraversed.
RecallthatSkrepresentsthesetoflargek-sequentialpatternsandCkisthesetofcandidatek-sequentialpatterns.Inthecan-didategenerationphase,TJPTconstructsboththetransactionandpathcomponentsofmobilesequentialtreeforstoringCk.Inthecandidategeneration,TJPTjoinstheL-transactionsetsoflarge(k−1)-sequentialpatternsforthegenerationofcandidatek-L-transactionsetandcomparesthepathsoflarge(k−1)-sequentialpatterns.Ifonepathdoesnotcontaintheotherpath,thegeneratedcandidatek-L-transactionsetistrimmed.Ifonepathpcontainstheotherpathq,TJPTgeneratesacandidatek-sequentialpatternsconsistingofthecandidatek-L-transactionsetandp.
Example3.4:ConsidertheexamplescenarioshowninFig.12.InalgorithmTJPT,thecandidate5-sequentialpattern{A;t1,C;t3,F;t4,G;t5,Q;t6}:ABCDEFGHQinFig.12(a)isgeneratedbyjoiningL-transactionset{A;t1,C;t3,F;t4,G;t5}insubpattern1andL-transactionset{A;t1,C;t3,F;t4,Q;t6}insubpattern2withthepathtrimmingtechniquetoidentifythefactthatpathABCDEFGHQcontainspathABCDEFG.Finally,{A;t1,C;t3,F;t4,G;t5,Q;t6}:ABCDEFGHQisqualifiedasacandidate5-sequentialpatternafterTJPTidentifiesthattheothersubpatterns(i.e.,subpatterns3,4,and5)arelarge4-sequentialpatternsinFig.12(b).
Byclassifyingthelargek-sequentialpatterns,TJPTcanefficientlygeneratecandidatek-sequentialpatterns.Partic-ularly,byclassifyingthepatternsinSkfork≥2,TJPTwillnotgenerateanyout-of-path(k+1)-sequentialpattern.ThisdemonstratestheveryadvantageofthepathtrimmingtechniqueTJPTemploys.
Example3.5:FortheexampleshowninFig.9,TJPTgen-eratesthecompletemobilesequencetreebyhashingnotonlyL-transactionsetsbutalsopathsincandidategenerationsothatthepathAWBCEFGwillnotbecountedforthesupport.Notethatsuchout-of-pathsequentialpatternsastheoneshowninFig.11willnotoccuranymore,showingasignificantperformanceimprovementofTJPToverTJLS.
288IEEETRANSACTIONSONSYSTEMS,MAN,ANDCYBERNETICS—PARTC:APPLICATIONSANDREVIEWS,VOL.37,NO.2,MARCH2007
Fig.12.Candidate5-sequentialpatternshownin(a)isgeneratedbyidentifyingtheexistenceofitsfivelarge4-sequentialsubpatternsshownin(b).
Fig.13.Onepatternfamilyexample.Patternsmarkedgrayarethepatternshavingthemaximal-pathlarge2-sequentialpattern.
large2-sequentialpatternwithpathlengthequalto9,largerthanthoseofotherlarge2-sequentialpatterns.
Definition3:Supposesk,k≥3,isthemaximalsequen-tialpatternofapatternfamily,andskconsistsofL-transactions{x1,x2,...,xk}andpathm1m2...mq.Thecentro-subtransactionsetofapatterninthispatternfamilyis{x2,...,xk−1}.
Example3.9:Forexample,thelarge4-sequentialpat-ABCDEFGHQtern{A;t1,C;t3,F;t4,Q;t6}:
canbeviewedastwoparts,i.e.,MS2={A;t1,Q;t6}:ABCDEFGHQandthecentro-subtransactionsetbeing{C;t3,F;t4}.Notethatalargek-sequentialpatternisapatternconsistingofkL-transactionsandapath.Foreachlargek-sequentialpattern,allitsk(k−1)-sequentialsubpatternsarelarge.Explicitly,foralargek-sequentialpatternwithL-transaction{x1,x2,...,xk},theL-transactionsofitsk(k−1)-sequentialsubpatternscanberepresentedby{x2,x3,...,xk},{x1,x3,...,xk},...,and{x1,x2,...,xk−1}.Then,wehavethefollowingremarks.Remark1:Foralargek-sequentialpatternpk,k≥3,thereexistatleastk−2large(k−1)-subpatternswhosepathsareidenticaltothatofpk.
Example3.10:Forthelarge5-sequentialpattern{A;t1,C;t3,F;t4,G;t5,Q;t6}:ABCDEFGHQshowninFig.13(a),thereexist3large4-subpatterns,
C;t3,F;t4,Q;t6}:ABCDEFGHQ,{A;t1,
{A;t1,C;t3,G;t5,Q;t6}:ABCDEFGHQ,and{A;t1,F;t4,G;t5,Q;t6}:ABCDEFGHQ,showninFig.13(b),whosepathsarethesamewiththeoneofthelarge5-sequentialpattern.
Remark2:Notethatamaximalsequentialpatternisalsoalargesequentialpattern.Thus,foramaximalsequen-tialpatternskwithL-transactions{x1,x2,...,xk}andpathm1m2...mq,thereexistk−2large(k−1)-sequentialsub-patternswhichhaveidenticalmaximal-pathlarge2-sequentialpattern{x1,xk}:m1m2...mq.
YUNANDCHEN:MININGMOBILESEQUENTIALPATTERNSINAMOBILECOMMERCEENVIRONMENT289
Fig.15.AlgorithmTJPFjoinsthecentro-subtransactionsetswithintegercomparison.
Fig.14.AlgorithmTJPFhashesthelastL-transactionfirstsothatthecentro-subtransactionsetisidentifiedandcomparestheindividualintegersstoredinshared-pathtree.
2)AlgorithmTJPFUsingShared-PathTreeinCan-didateGeneration:AlgorithmTJPFisabletogenerateamaximalsequentialpatternskwithL-transactions{x1,x2,...,xk}andpathm1m2...mqasacandidatesequentialpatternbyjoiningthepreviouslargesequentialpatterns{x1,(x2,x3,...,xk−3,xk−2),xk}:m1,m2,...,mqand{x1,(x2,x3,...,xk−3,xk−1),xk}:m1,m2,...,mqwiththepatternfamilytechnique.Explicitly,TJPFobtains:1){x2,x3,...,xk−3,xk−2,xk−1}byjoiningthecentro-subtransactionsets{x2,x3,...,xk−3,xk−2}and{x2,x3,...,xk−3,xk−1}and2)thenewMS2={x1,xk}:m1,m2,...,mqbycomparingtheMS2’sinthepreviouslargesequentialpatterns.ByhashingthelastL-transactionfirstandstoringanintegerforeachpath,TJPFconstructsthemobilesequencetreewithaformthatforeachcandidatesequentialpatterns,twoL-transactionsofMS2comefirst,acentro-subtransactionsetisinthemiddle,andaninteger,whichisreturnedfromshared-pathtreeforindexingthecorrespondingpath,isintheleaf.
Example3.11:Forexample,TJPFstoresthecan-didatesequentialpatterns{A;t1,C;t3,F;t4,Q;t6}:ABCDEFGHQand{A;t1,C;t3,G;t5,Q;t6}:ABCDEFGHQinFig.12(b)intothemobilese-quencetreeasinFig.14(a).TJPFhashesthelastL-transaction,Q;t6,inthefirstpositionofthemobilesequencetree.NotethatpathABCDEFGHQisrepresentedbytheintegerp3,derivedfromthemappingoftheshared-pathtreeinFig.14(b).Then,TJPFcanjointheL-transactionsetsofthesetwolarge4-sequentialpatterns,i.e.,pattern2and3showninFig.15(b),withabuffertokeepablockthatcontainstheleafnodesinthetransactioncomponentandthecorrespondingintegersinthepathcomponenttoclassifythepatternsforgeneratingthecandidate5-sequentialpatterninFig.12(a)efficiently.
FromRemark2,weknowthatTJPFjoinsSkforgeneratingCk+1withthepatternfamilytechnique,andthepathsofalllarge
sequentialpatternstobejoinedareidenticaltooneanotherinroundk,k≥3.Thus,TJPFjoinsthecentro-subtransactionsetswithcomparingindividualintegersforachievingperformanceimprovement.
Example3.12:FortheexampleshowninFig.15,TJPFjoinsthecentro-subtransactionsetsinpattern4and5showninFig.15(c)withcomparingintegerswhichareequaltoeachother(i.e.,p3)togeneratepattern2inFig.15(b).Similarly,TJPFjoinsthecentro-subtransactionsetsinpattern2and3showninFig.15(b)bycomparingintegerstogeneratepattern1inFig.15(b).
ThemethodforalgorithmTJPFtoreducecomputationaloverheadandmemoryconsumptionisasfollows.Inthefirstround,algorithmTJPFalsojoinstheL-transactionsinS1forgeneratingR2tobestoredinthetransactioncomponentofamo-bilesequencetree.However,inthesecondround,TJPFhasheseachcombinationof2-L-transactionssetineachmaximalL-transactionsequenceintothetransactioncomponentbyhashingthelastL-transactionfirst.Then,TJPFhashesthecorrespond-ingpathintotheshared-pathtreewhichhasanassignedintegerineachleafnodeforrepresentingthepathfromtherootnodetotheparentnodeofthatleafnode.TJPFnextreturnstheinte-gerforconstructingthepathcomponentofthemobilesequencetreewhilekeepingcountingthesupport.Afterthecandidate2-sequentialpatternswiththeminimumsupportareidentifiedasthelarge2-sequentialpatterns,algorithmTJPFjoinstheL-transactionsetswiththepathtrimmingtechniquetogeneratecandidate3-sequentialpatterns.Notethattheshared-pathtreeconstructedinroundtwowillbeusedforthemappingbetweenpathsandintegersinthefollowingrounds.
Example3.13:Forexample,theshared-pathtreeshowninFig.14(b)mapsthepathsoflarge4-sequentialpatternsinFig.12(b)intointegers{p1,p2,p3,p4,p5}.
Inthethirdround,TJPFcountsthesupportsofcandidate3-sequentialpatternsbyhashingintothemobilesequencetreetheL-transactionsetsandintegersreturnedfromtheshared-pathtree.Indestructingthemobilesequencetree,TJPFnotonlydetermineslarge3-sequentialpatternsbutalsousesabuffer
290IEEETRANSACTIONSONSYSTEMS,MAN,ANDCYBERNETICS—PARTC:APPLICATIONSANDREVIEWS,VOL.37,NO.2,MARCH2007
Fig.16.UncertaincandidatetransactionpatternsgeneratedbyTJTJPTand
PF.
tokeepablockthatcontainstheleafnodesinthetransactioncomponentandthecorrespondingintegersinthepathcompo-nenttoclassifythepatterns.ByhashingthelastL-transactionfirst,themobilesequencetreeiswell-structuredandTJPFcom-parestheindividualintegerstotrimthegenerationofcandidatesequentialpatternsaccordingtothepaths.Ineachsubsequentround,TJPFconstructsthemobilesequencetreebyhashingthelastL-transactionfirstandutilizingtheshared-pathtreeformappingsothatTJPFcomparestheindividualintegersintrimmingthegenerationofcandidatesequentialpatterns.Al-gorithmTJPFthushasO(1)executiontimecomplexityinthisstep,betterthanO(|P|)byalgorithmTJPT,where|P|istheaveragepathlengthoflargesequentialpatterns.Inaddition,un-likealgorithmTJPT,algorithmTJPFutilizesMS2tofilteroutsomeuncertaincandidatesequentialpatternsbeforesubpatternidentification.Foracandidatek-sequentialpatterns,itshouldhaveklarge(k−1)-sequentialpatterns.Foranuncertaincan-didatek-sequentialpatterns,itisgeneratedbyjoiningtwolarge(k−1)-sequentialpatterns.Thus,forprovingthatanuncer-taincandidatek-sequentialpatternsisqualifiedasacandidatek-sequentialpatterns,therearek−2subpatternidentificationstheneedtobeconducted.
Example3.14:Forexample,takingthelarge4-sequentialpatternsshowninFig.8,TJPTgeneratesfouruncertaincan-didate5-sequentialpatternsshowninFig.16(a).However,byutilizingthepatternfamilytechnique,TJPFonlygeneratesoneuncertaincandidate5-sequentialpatternshowninFig.16(b).Thus,TJPTconducts12subpatternidentificationsandTJPFconductsthreesubpatternidentifications.ThisdemonstratestheveryadvantageofthepatternfamilytechniqueTJPFemploys.IV.EXPERIMENTALRESULTS
ToassesstheperformanceofTJLS,TJPT,andTJPF,weconductedseveralexperimentstodeterminelargesequentialpatterns.Theseexperimentsareperformedonacomputerwitha1-GHzIntelCPUand512MBofmemory.ThemethodusedtogeneratesyntheticdataisdescribedinSectionIV-A.InSectionIV-B,performanceofTJLS,TJPT,andTJPFiscomparativelystudied.
A.GenerationofSyntheticMobileTransactionSequencesIntheexperiments,themovingscenariowithtransactionsmadeinamobilecommerceenvironmentissimulated.Sincethemobilecommerceserviceisanewapplicationinthenearfuture,webelievethatthecustomershavethesimilarbehaviorstothoseoftheminthecurrentdatanetworkwhentheyfirstusethisservice.Afterthisserviceisusedbycustomers,thebehav-
Fig.17.Parametersusedinthesimulation.
Fig.18.Meshnetworktosimulatemobilecommerceenvironment.
iorswillthenbechangedaccordingtotheirusageexperiences.Currently,thereisnorealscenariothatwecanmimic.Thus,inthispaper,thesimulationmodelforgeneratingsyntheticmobiletransactionsequencesisinfactsimilartothatinthecompanionpapers[43],[49].Explicitly,themethodforgeneratingmovingpatternsissimilartothatin[49]andthemethodforgeneratingtransactionsissimilartothatin[43].
Fig.17summarizesthemeaningsofvariousparametersusedintheexperiments.First,weconstructann×nmeshnetwork[40]withamodificationbytakingthegeographicboundaryintoconsiderationtolimitthenumberofneighborssoastomimicthemobileenvironment,whereeachnoderepresentsonecell[49].Thenumberofitemsineachcellisdeterminedfromauniformdistributionwithinagivenrange,denotedbynI.Foreachcell,theadvancingprobabilityPaofeachneighboristheprobabilityforacustomertomovetoneighboringcellstopurchasetheitemssoldthere.Inessence,eachdirectededgefromonecellAtoanothercellBisassignedwithaweight,correspondingtotheadvancingprobabilityofBforA.Inthemodel,theadvancingprobabilityisobtainedbytheratioofthenumberofitemssoldineachneighbortothosenum-bersofotherneighbors.Forthe3×3meshnetworkexampleshowninFig.18(a),therearefourneighborsN1,N2,N3,N4forcellYwiththecorrespondingadvancingprobabili-tiesPa1,Pa2,Pa3,Pa4.Inaddition,iN1,iN2,iN3,iN4arethenumbersofitemssoldincellsN1,N2,N3,N4andwehavePa1=(iN1)/(iN1+iN2+iN3+iN4)andPa2=(iN2)/(iN1+iN2+iN3+iN4).
Intheexperiments,|D|isthenumberofmobiletransactionsequencesgenerated.Whenacustomermovesamongcellsfor
YUNANDCHEN:MININGMOBILESEQUENTIALPATTERNSINAMOBILECOMMERCEENVIRONMENT291
Fig.19.(a)ExecutiontimeofalgorithmsTJLS,TJPT,andTJPFwhenminimumsupportvariesand(b)executiontimeofalgorithmsTJLS,TJPT,andTJPFwhennumberofmobiletransactionsequencesvaries.
shoppingintheMCenvironment,themobiletransactionse-quencecompletedbythiscustomerconsistsofamovingpathandasetoftransactionsmadeinthecorrespondingcells.Thestartingpositionofeachmobilesequentialpatterncanbeeithervistorlocationregister(VLR)orhomelocationregister(HLR)andisrandomlyselectedamongthesecells[34].Amovingpathconsistsofcellsmovedbyauser.ThesizeofeachmovingpathisdeterminedfromaPoissondistributionwithmeanequalto|P|.Whenacustomermovestoacell,theprobabilitythatthiscustomermakesthetransactioninthiscellisdenotedbyPb.NotethatthenumberofitemsineachcellisdeterminedfromauniformdistributionwithinagivenrangenI.Foreachcell,oncethenumberofitemsisdetermined,theitemsthatcouldbepurchasedineachcellarefixed.Themethodforgeneratingtransactiondataineachcellissimilartotheoneinthepriorwork[43].Inthemobilecommerceenvironment,peopletendtobuysetsofitemstogether,whicharealsocalledpotentialmaximalfrequentsets.Thesizeofthemaximalelementsisclusteredaroundameanwithafewlongitemsets.Atransactionmaycontainoneormoreofsuchfrequentsets.Thetransactionsizeisalsoclusteredaroundamean,whichisdenoted|T|.Theprobabilitythatauserwillmovefromthecurrentcellbacktothecellfromwhichhe/shecame,calledthebackwardweight,isdenotedbyP0,whichisequaltoPa×Pd,wherePdisadamp-ingfactorbecauseofthebackwardmovement.Withoutlossofgenerality,Pdissetto0.8inourexperiments.TheprobabilityofmovingtoeachneighborPmisalsodeterminedbythead-vancingprobabilityandthesumoftheweightsforallthesecellsisequalto1−P0.ForthemeshnetworkshowninFig.18(a),whenoneuservisitscellYfromcellN1,theprobabilitiesoftheneighborsthatthisuserwillmovetoareshowninFig.18(b).B.PerformanceComparison
Inthefollowingexperiments,weconstructan8×8meshnetworkandset|D|=200K,s=0.5%,nI=200,Pb=0.5,Pd=0.8,|T|=4,and|P|=20.
1)ExperimentOne:WhentheMinimumSupportVaries:Inthisexperiment,svariesfrom1.5%to0.25%.Fig.19(a)showsthatTJPTandTJPFingeneral,outperformTJLSforvariousminimumsupports.Withthepathtrimmingandthepatternfamilytechniques,bothTJPTandTJPFcangeneratefewercandidatesequentialpatternsthanTJLS,whichsuffersalotofout-of-pathsequentialpatternsineveryround.Astheminimumsupportdecreases,theexecutiontimesofallthealgorithmsincreasebecauseoftheincreasesinthetotalnumberofcandidateandlargesequentialpatterns.
2)ExperimentTwo:WhentheNumberofMobileTransactionSequencesVaries:Inthisexperiment,|D|variesfrom200to1000K.Fig.19(b)showsthattheexecutiontimesofTJPTandTJPFincreaselinearlyasthedatabasesizeincreases,indicatingthegoodscale-upfeatureofTJPTandTJPF.
3)ExperimentThree:WhenPurchasingProbabilityVaries:NotethatalgorithmTJLSsufferstheout-of-pathsequentialpat-ternproblem.Toaddressthisproblem,weconductthisex-perimentwiththepurchaseprobabilityPbvaryingfrom0.5to0.3,andtheresultisshowninFig.20(a).Foreachal-gorithm,itsexecutiontimeistakenasthebasepointwhenPbis0.5,andFig.20(a)showstheexecutiontimewhenPbvaries.Whenthepurchaseprobabilitydecreases,theexecutiontimesofallthealgorithmsdecreasebecauseofthedecreasesinthetotalnumberofcandidateandlargesequentialpatterns.However,thepathlengthsoftheout-of-pathsequentialpat-ternsincreasebecausetheaveragenumberofcellsvisitedpertransactionincreases.Notethatalthoughthetotalnumberofcandidateandlargesequentialpatternsdecreases,theout-of-pathsequentialpatternproblemcausesalgorithmTJLStostillcountthesupportsofnonlargesequentialpatterns.Asare-sult,whenPbdecreases,thedecreaseoftheexecutiontimeofTJLSisnotasprominentasthoseofTJPTandTJPF.Toprovidemoreinsightintotheperformancecomparisonsofalgo-rithms,itisshowninFig.21thatTJPTandTJPFoutperformTJLSindifferentdatabasesizes,whichindicatesthatTJPTandTJPFarerobustinthesensitivityanalysisofthepurchasingprobability.
4)ExperimentFour:WhentheAveragePathLengthVaries:Toexaminethesensitivityofvaryingtheaveragepathlength,|P|variesfrom10to30.TheresultisshowninFig.20(b).Foreachalgorithm,itsexecutiontimeistakenasthebasepointwhen|P|is10,andFig.20(b)showstheexecutiontimewhen|P|varies.ItcanbeseenthatTJPFislesssensitivetothevariationofpathlengththanTJPT.ThisagreeswiththefactthatTJPFhasO(1)executiontimeforcomparingthepathinthecandidategener-ationstage,whereasthecorrespondingcomplexityofTJPTis
292IEEETRANSACTIONSONSYSTEMS,MAN,ANDCYBERNETICS—PARTC:APPLICATIONSANDREVIEWS,VOL.37,NO.2,MARCH2007
Fig.20.(a)ExecutiontimeofalgorithmsTJLS,TJPT,andTJPFwhenpurchasingprobabilityvariesand(b)executiontimeofalgorithmsTJPTandTJPFwhenaveragepathlengthofmobiletransactionsequencesvaries.
Fig.21.ExecutiontimeofalgorithmsTJLS,TJPT,andTJPFwhenpurchasingprobabilityvariesindifferentdatabasesizes.
O(|P|).Inaddition,itisalsoshowninFig.22thatTJPFout-performsTJPTinthesensitivityanalysisoftheaveragepathlengthwithdifferentdatabasesizes.ToprovidemoreinsightintothecandidategenerationstageofTJPTandTJPF,itisshowninFig.23thattheratio(TJPT)/(TJPF)ofexecutiontimewhichisincurredbycomparingthepathisalmostequalto(O(|P|))/(O(1)).
5)ExperimentFive:PerformanceComparisonBetweenTJPTandTJPFinEachRound:Toprovidemoreinsightsintothesharedpathtreefeatureexploitedbypatternfamilytechnique,weset|D|=200000,s=0.5%,nI=200,Pb=0.5,Pd=0.8,|T|=4,and|P|=20andcomparetheperfor-manceofTJPTandTJPFineachround.BecauseS1isob-tainedinthelarge-transactiongenerationphase,wethususeroundonetorefertotheprocedureperformedtoobtain(R2)anduseroundtwotorefertotheprocedureperformedtoobtain(C2,S2,C3).NotethatTJPTandTJPFgenerateSkalongwiththegenerationofCk+1,weuseroundk,k≥3torefertotheprocedureperformedtoobtain(Sk,Ck+1).AsshowninFig.24(a),TJPFconsistentlyoutperformsTJPTinallrounds,exceptroundone.Thisagreeswithourintuition.Notethatinroundone,withoutanypathinformation,bothTJPTandTJPF
YUNANDCHEN:MININGMOBILESEQUENTIALPATTERNSINAMOBILECOMMERCEENVIRONMENT293
Fig.22.ExecutiontimeofalgorithmsTJPTandTJPFwhenaveragepathlengthofmobiletransactionsequencesvariesindifferentdatabasesizes.
Fig.23.Ratioofexecutiontimewhichisincurredbycomparingthepath.
jointheL-transactionsinS1forgeneratingR2tobestoredinthetransactioncomponentofamobilesequencetree.Inroundtwo,TJPTconstructsthepathcomponentofthemobilesequentialtreeforstoringC2.Inthefollowingrounds,whenTJPTstoresthepathinformationofCk,k≥3,TJPTstillneedstoconstructthepathcomponentofthemobilesequentialtreeforstoringCk.However,byutilizingthepatternfamilyrela-tionship,TJPFcanusetheshared-pathtreegeneratedinC2forindexingthepathinformationofCk,k≥3,inthefollow-ingrounds,leadingtomoreefficientexecution.Inaddition,toprovidemoreinsightsintoTJPFandTJPT,thenumbersofbranchesforstoringthepathinformationofCkareshowninFig.24(b).InTJPT,thesebranchesarestoredinthemobilese-quentialtreesforallrounds.InTJPF,thesebranchesarestored
Fig.24.PerformancecomparisonbetweenTJPTandTJPFineachround.(a)Executiontimeand(b)thenumberofpathsstored.
intheshared-pathtreegeneratedinroundtwo,andthus,theamountofmemorysavingsis25.8MB.
V.CONCLUSION
Inthispaper,weexploredadataminingcapabilitywhichinvolvesminingmobilesequentialpatternsforanMCenvi-ronment.Inessence,theminingofmobilesequentialpatternsaggregatestheconceptsofminingassociationrules(miningpathtraversalpatternsandminingsequentialpatterns)andthus
294IEEETRANSACTIONSONSYSTEMS,MAN,ANDCYBERNETICS—PARTC:APPLICATIONSANDREVIEWS,VOL.37,NO.2,MARCH2007
requiresacombineduseofcorrespondingtechniques.Byhav-ingdifferentprioritiesonthefactorsinvolvinglargeitemsets,traversalpaths,andordersofpurchases,wehavedevisedthreealgorithms(algorithmTJLS,algorithmTJPT,andalgorithmTJPF)fordetermininglargesequentialpatternsfrommobiletransactionsequences.TJLSisdevisedinlightoftheconceptofassociationrules,andTJPTisdevisedbytakingboththeconceptsofassociationrulesandpathtraversalpatternsintoconsideration.Byutilizingthepatternfamilytechnique,TJPFisabletogeneratelargesequentialpatternsveryefficiently.AsimulationmodelfortheMCenvironmentwasdeveloped,andasyntheticworkloadwasgeneratedforperformancestud-ies.Inourperformancestudy,theproposedalgorithmTJPFsignificantlyoutperformsothersinbothexecutionefficiencyandmemorysaving,indicatingtheusefulnessofthepatternfamilytechnique.Itisshownbyourresultsthatbytakingbothmovingpatternsandpurchasepatternsintoconsideration,onecanhaveabettermodelforanMCsystemandthusbeabletoexploittheintrinsicrelationshipbetweenthesetwoimportantcustomerbehaviorsfortheefficientminingofmobilesequentialpatterns.
REFERENCES
[1][Online].Available:http://www.mobilecommerceworld.com
[2]WalletApplication[Online].Available:http://www.forum.nokia.com.[3]R.Agarwal,C.Aggarwal,andV.V.V.Prasad,“Atreeprojectionalgorithm
forgenerationoffrequentitemsets,”J.ParallelDistrib.Comput.,2000.[4]C.C.Aggarwal,J.L.Wolf,K.-L.Wu,andP.S.Yu,“Theintelligent
recommendationanalyzer,”inProc.ICDCSInt.WorkshopofKnowledgeDiscoveryandDataMiningintheWorld-WideWeb,Taipei,Taiwan,Apr.10–13,2000,pp.F67–F72.
[5]R.Agrawal,T.Imielinski,andA.Swami,“Miningassociationrulesbe-tweensetsofitemsinlargedatabases,”inProc.ACMSIGMODInt.Conf.ManagementofData,May1993,pp.207–216.
[6]R.AgrawalandR.Srikant,“Fastalgorithmsforminingassociationrules
inlargedatabases,”inProc.20thVLDBConf.,Sep.1994,pp.478–499.[7],“Miningsequentialpatterns,”inProc.11thInt.Conf.DataEngi-neering,Mar.1995,pp.3–14.
[8]J.M.AleandG.H.Rossi,“Anapproachtodiscoveringtemporalassoci-ationrules,”inProc.ACMSymp.AppliedComputing,Como,Italy,Mar.2000,pp.294–300.
[9]A.M.Ayad,N.M.El-Makky,andY.Taha,“Incrementalminingof
constraintedassociationrules,”inProc.1stSIAMConf.DataMining,Chicago,IL,Apr.2001.
[10]Y.Bastide,R.Taouil,N.Pasquier,G.Stumme,andL.Lakhal,“Miningfre-quentpatternswithcountinginference,”ACMSIGKDDExplor.Newslett.,vol.2,no.2,pp.66–75,Dec.2000.
[11]M.-S.Chen,J.Han,andP.S.Yu,“Datamining:Anoverviewfrom
adatabaseperspective,”IEEETrans.Knowl.DataEng.vol.8,no.6,pp.866–833,Dec.1996.
[12]M.-S.Chen,J.-S.Park,andP.S.Yu,“Efficientdataminingforpath
traversalpatterns,”IEEETrans.Knowl.DataEng.,vol.10,no.2,pp.209–221,Apr.1998.
[13]X.ChenandI.Petrounias,“Discoveringtemporalassociationrules:Al-gorithms,languageandsystem,”inProc.16thInt.Conf.DataEng.,SanDiego,CA,Feb.28–Mar.3,2000,p.306.
[14]D.Cheung,J.Han,V.Ng,andC.Y.Wong,“Maintenanceofdiscovered
associationrulesinlargedatabases:Anincrementalupdatingtechniques,”inProc.Int.Conf.DataEng.,Feb.1996,pp.106–114.
[15]F.Coenen,G.Goulbourne,andP.Leng,“Computingassociationrules
usingpartialtotals,”inProc.5thEur.Conf.PrinciplesofDataMiningandKnowledgeDiscovery,Freiburg,Germany,Sep.3–5,2001,pp.54–66.[16]R.Cooley,B.Mobasher,andJ.Srivastava,“Datapreparationformining
worldwidewebbrowsingpatterns,”J.Knowl.Inf.Syst.,vol.1,no.1,1999.
[17]G.Das,K.-I.Lin,H.Mannila,G.Renganathan,andP.Smyth,“Rule
discoveryfromtimeseries,”Proc.4thInt.Conf.KnowledgeDiscoveryandDataMining(KDD-98),Aug.1998.
[18]U.M.Fayyad,G.Piatetsky-Shapiro,P.Smyth,andR.Uthurasamy,Ad-vancesinKnowledgeDiscoveryandDataMining.Cambridge,MA:MITPress,1996.
[19]R.Floyd,B.Housel,andC.Tait,“Mobilewebaccessusingenetworkweb
express,”IEEEPers.Commun.,vol.5,no.5,pp.47–52,Oct.1998.
[20]BluetoothOverview,1999[Online].Available:http://www.bluetooth.com[21]WAPForumWirelessApplicationProtocol.[Online].Available:
http://www.wapforum.org/
[22]J.Han,G.Dong,andY.Yin,“Efficientminingofpartialperiodicpatterns
intimeseriesdatabase,”inProc.15thInt.Conf.DataEngineering,Mar.1999.
[23]J.HanandY.Fu,“Discoveryofmultiple-levelassociationrulesfromlarge
databases,”inProc.21thVLDBConf.,Sep.1995,pp.420–431.
[24]J.Han,W.Gong,andY.Yin,“Miningsegment-wiseperiodicpatternsin
time-relateddatabases,”inProc.4thInt.Conf.KnowledgeDiscoveryandDataMining(KDD-98),Aug.1998,pp.214–218.
[25]J.Han,L.V.S.Lakshmanan,andR.T.Ng,“Constraint-based,multidi-mensionaldatamining,”IEEEComput.,vol.32,no.8,pp.46–50,Aug.1999.
[26]J.Han,J.Pei,B.Mortazavi-Asl,Q.Chen,U.Dayal,andM.-C.Hsu,
“FreeSpan:Frequentpattern-projectedsequentialpatternmining,”inProc.2000Int.Conf.KnowledgeDiscoveryandDataMining,Aug.2000,pp.355–359.
[27]J.Han,J.Pei,andY.Yin,“Miningfrequentpatternswithoutcandidate
generation,”inProc.ACMSIGMODConf.,2000.
[28]L.V.S.Lakshmanan,R.Ng,J.Han,andA.Pang,“Optimization
ofconstrainedfrequentsetquerieswith2-variableconstraints,”inProc.ACM-SIGMODConf.ManagementofData,Jun.1999,pp.157–168.
[29]C.-H.Lee,C.-R.Lin,andM.-S.Chen,“Onmininggeneraltemporal
associationrulesinapublicationdatabase,”inProc.1stIEEEInt.Conf.DataMining(ICDM2001),Nov./Dec.2001.[30],“Sliding-windowfiltering:Anefficientalgorithmforincremental
mining,”inProc.ACM10thInt.Conf.Inf.andKnowledgeManagement,Nov.2001.
[31]C.-R.Lin,C.-H.Yun,andM.-S.Chen,“Usingslicescanandselective
hashforepisodemining,”inProc.WorkshoponTemporalDataMining(SIGKDD2001),Aug.2001.
[32]J.-L.LinandM.H.Dunham,“Miningassociationrules:Anti-skewalgo-rithms,”inProc.Int.Conf.DataEngineering,1998,pp.486–493.
[33]M.-Y.LinandS.-Y.Lee,“Incrementalupdateonsequentialpatternsin
largedatabases,”inProc.10thIEEEInt.Conf.ToolswithArtificialIntel-ligence,Nov.1998,pp.24–31.
[34]Y.-B.Lin,“ModelingtechniquesforPCSnetworks,”IEEECommun.
Mag.,vol.35,no.2,pp.102–107,Feb.1997.
[35]B.Liu,W.Hsu,andY.Ma,“Miningassociationruleswithmultiple
minimumsupports,”inProc.5thACMSIGKDDInt.Conf.KnowledgeDiscoveryandDataMining,Aug.1999.
[36]Ky.S.M.N.GarofalakisandR.Rastogi,“SPIRIT:Sequentialpattern
miningwithregularexpressionconstraints,”VLDBJ.,1999.
[37]H.MannilaandC.Meek,“Globalpartialordersfromsequentialdata,”
inProc.6thACMSIGKDDInt.Conf.KnowledgeDiscoveryandDataMining,Boston,MA,2000,pp.161–168.
[38]H.Mannila,H.Toivonen,andA.I.Verkamo,“Discoveryoffrequent
episodesineventsequences,”DataMiningKnowl.Discov.,vol.1,no.3,pp.259–289,1997.
[39]F.Masseglia,F.Cathala,andP.Poncelet,“ThePSPapproachformining
sequentialpatterns,”inProc.2ndEur.Conf.PrinciplesofDataMiningandKnowledgeDiscovery,Nantes,France,Sep.1998,vol.1510,pp.176–184.
[40]E.ModianoandA.Ephremides,“Efficientalgorithmsforperforming
packetbroadcastsinameshnetwork,”IEEE/ACMTrans.Networking,vol.4,no.4,pp.639–648,Aug.1996.
[41]R.Muntz,W.Wang,andJ.Yang,“TAR:Temporalassociationruleson
evolvingnumericalattributes,”inProc.17thInt.Conf.DataEngineering,2001.
[42]B.PadmanabhanandA.Tuzhilin,“Patterndiscoveryintemporal
databases:Atemporallogicapproach,”inProc.2ndInt.Conf.Knowl-edgeDiscoveryandDataMining,1996,pp.35–354.
[43]J.-S.Park,M.-S.Chen,andP.S.Yu,“Aneffectivehashbasedalgorithm
forminingassociationrules,”inProc.ACMSIGMODConf.,May1995,pp.175–186.
[44]S.Parthasarathy,M.J.Zaki,M.Ogihara,andS.Dwarkadas,“Incremental
andinteractivesequencemining,”inProc.ACMCIKMInt.Conf.Inf.andKnowledgeManagement,1999,pp.251–258.
YUNANDCHEN:MININGMOBILESEQUENTIALPATTERNSINAMOBILECOMMERCEENVIRONMENT295
[45]N.Pasquier,Y.Bastide,R.Taouil,andL.Lakhal,“Discoveringfrequent
closeditemsetsforassociationrules,”inProc.7thInt.Conf.DatabaseTheory,Jan.1999.
[46]J.PeiandJ.Han,“Canwepushmoreconstraintsintofrequentpattern
mining?”inProc.2000Int.Conf.KnowledgeDiscoveryandDataMining,Aug.2000.
[47]J.Pei,J.Han,andR.Mao,“CLOSET:Anefficientalgorithmformining
frequentcloseditemsets,”inProc.ACM-SIGMODWorkshoonRes.IssuesinDataMiningandKnowledgeDiscovery,2000,pp.21–30.
[48]J.Pei,J.Han,H.Pinto,Q.Chen,U.Dayal,andM.-C.Hsu,“PrefixSpan:
Miningsequentialpatternsefficientlybyprefix-projectedpatterngrowth,”inProc.17thInt.Conf.DataEng.,Apr.2001.
[49]W.-C.PengandM.-S.Chen,“Developingdataallocationschemesbyin-crementalminingofusermovingpatternsinamobilecomputingsystem,”IEEETrans.Knowl.DataEng.,vol.15,no.1,pp.70–85,Feb.2003.[50]P.PirolliandJ.E.Pitkow,“DistributionsofSurfers’pathsthroughthe
worldwideweb:Empiricalcharacterization,”WorldWideWeb,vol.2,no.1–2,pp.29–45,1999.
[51]J.E.PitkowandP.Pirolli,“Mininglongestrepeatedsubsequencesto
predictworldwidewebsurfing,”inProc.2ndUSENIXSymp.InternetTechnologiesandSyst.,Oct.1999.
[52]J.B.Schafer,J.Konstan,andJ.Riedl,“RecommendersystemsinE-commerce,”inProc.1stACMConf.ElectronicCommerce,Denver,CO,Nov.1999,pp.158–166.
[53]R.SrikantandR.Agrawal,“Mininggeneralizedassociationrules,”in
Proc.21thVLDBConf.,Sep.1995,pp.407–419.
[54]R.AgrawalandR.Srikant,“Miningquantitativeassociationrulesinlarge
relationaltables,”inProc.1996ACM-SIGMODConf.ManagementofData,1996,pp.1–12.
[55]R.SrikantandR.Agrawal,“Miningsequentialpatterns:Generalizations
andperformanceimprovements,”inProc.1996Int.Conf.ExtendingDatabaseTechnology(EDBT’96),Mar.1996,pp.201–212.
[56]A.K.H.Tung,H.Lu,J.Han,andL.Feng,“Breakingthebarrierof
transactions:Miningintertransactionassociationrules,”inProc.5thACMSIGKDDInt.Conf.KnowledgeDiscoveryandDataMining,Aug.1999,pp.297–301.
[57]U.Varshney,R.J.Vetter,andR.Kalakota,“Mobilecommerce:Anew
frontier,”IEEEComput.,vol.33,no.10,pp.32–38,Oct.2000.
[58]J.Veijalainene,“Transactionsinmobileelectroniccommerce,”inProc.
8thInt.WorkshoponFoundationsofModelsandLanguagesforDataandObjects,Sep.1999,pp.203–224.
[59]R.Villafane,K.A.Hua,D.Tran,andB.Maulik,“Knowledgediscovery
fromseriesofintervalevents,”J.Intell.Inf.Syst.,vol.15,no.1,pp.71–89,2000.
[60]K.Wang,Y.He,andJ.Han,“Miningfrequentitemsetsusingsupport
constraints,”inProc.26thVLDBConf.,Sep.2000,pp.43–52.
[61]K.Wang,S.Q.Zhou,andS.C.Liew,“Buildinghierarchicalclassifiers
usingclassproximity,”inProc.25thVLDBConf.,1999,pp.363–374.[62]C.Yang,U.Fayyad,andP.Bradley,“Efficientdiscoveryoferror-tolerant
frequentitemsetsinhighdimensions,”inProc.7thACMSIGKDDInt.Conf.KnowledgeDiscoveryandDataMining,SanFrancisco,CA,2001,pp.194–203.
[63]J.Yang,W.Wang,andP.S.Yu,“Miningasynchronousperiodicpatterns
intimeseriesdata,”inProc.6thACMSIGKDDInt.Conf.KnowledgeDiscoveryandDataMining,2000,pp.275–279.
[64]C.-H.YunandM.-S.Chen,“Usingpattern-joinandpurchase-combination
forminingwebtransactionpatternsinanelectroniccommerceenviron-ment,”inProc.24thIEEEAnnu.Int.ComputerSoftwareandApplicationConf.,Oct.2000,pp.99–104.
[65]M.J.Zaki,“Sequenceminingincategoricaldomains:Incorporatingcon-straints,”inProc.2000ACMCIKMInt.Conf.Inf.andKnowledgeMan-agement,Nov.2000.[66],“SPADE:Anefficientalgorithmforminingfrequentsequences,”
Mach.Learn.,vol.42,no.1/2,pp.31–60,2001.
[67]M.J.ZakiandC.Hsiao,“CHARM:Anefficientalgorithmforclosed
itemsetmining,”inProc.2ndSIAMConf.DataMining,Apr.2002.[Online].Available:http://www.siam.org/meetings/sdm02/proceedings/sdm02-27.pdf
[68]M.J.Zaki,S.Parthasarathy,M.Ogihara,andW.Li,“Newalgorithmsfor
fastdiscoveryofassociationrules,”inProc.3rdACMSIGKDDInt.Conf.KnowledgeDiscoveryandDataMining,Aug.1997,pp.283–286.
[69]Q.Zheng,K.Xu,S.Ma,andW.Lv,“Thealgorithmsofupdatingsequential
patterns,”inProc.5thInt.WorkshoponHighPerformanceDataMininginconjunctionwith2ndSIAMConf.DataMining,2002.[Online].Available:http://arxiv.org/abs/cs.DB/0203027
Ching-HuangYunreceivedtheB.S.andPh.D.de-greesinelectricalengineeringfromtheNationalTaiwanUniversity,Taipei,Taiwan,R.O.C.,in1997and2005,respectively.
HeiscurrentlyanR&DmanagerwithIBTekInc.,Taipei.Hisresearchinterestsincludedatamining,webtechnologies,andvideosurveillancesystems.
Ming-SyanChen(S’88–M’88–SM’93–F’04)re-ceivedtheB.S.degreeinelectricalengineeringfromtheNationalTaiwanUniversity,Taipei,Taiwan,R.O.C.,andtheM.S.andPh.D.degreesincomputer,information,andcontrolengineeringfromTheUni-versityofMichigan,AnnArbor,in1985and1988,respectively.
HeiscurrentlyaProfessorandtheChairmanoftheGraduateInstituteofCommunicationEngineer-ingandaProfessorwiththeElectricalEngineeringDepartmentandtheComputerScienceandInforma-tionEngineeringDepartment,NationalTaiwanUniversity.HewasaResearchStaffMemberattheIBMThomasJ.WatsonResearchCenter,YorktownHeights,NY,from1988to1996.Hisresearchinterestsincludedatabasesystems,datamining,mobilecomputingsystems,andmultimedianetworking,andhehaspublishedmorethan200papersinhisresearchareas.Heholds,orhasappliedfor,18U.S.patentsandsevenR.O.C.patentsintheareasofdatamining,Webapplications,interactivevideoplayout,videoserverdesign,andconcurrencyandcoherencycontrolprotocols.HeiscurrentlyontheeditorialboardoftheVeryLargeDataBase(VLDB)Journal,theKnowledgeandInformationSystems(KAIS)Journal,theJournalofInformationScienceandEngineering,andtheInternationalJournalofElectricalEngineering.
Dr.ChenservedasanAssociateEditorofIEEETRANSACTIONSONKNOWLEDGEANDDATAENGINEERINGfrom1997to2001.HewasaDistin-guishedVisitoroftheIEEEComputerSocietyforAsia-Pacificfrom1998to2000.HeservedastheProgramChairofPacificAreaKnowledgeDiscoveryandDataMining,InternationalViceChairofINFOCOM2005,andProgramVice-ChairofIEEEICDCS2005,ICPP2003,andVLDB-2002.HewasakeynotespeakeronWebdataminingattheInternationalComputerCongressinHongKong,1999,atutorialspeakeronWebdataminingatDASFAA-1999,andonparalleldatabasesatthe11thIEEEICDEin1995.HewasalsoaGuestCoeditorforIEEETKDEonaspecialissueondatamininginDecember1996.HeisarecipientoftheNationalScienceCouncilDistinguishedResearchAwardandK.-T.LiResearchPenetrationAwardforhisresearchwork.HealsoreceivedtheOutstandingInnovationAwardfromIBMCorporateforhiscontributiontoamajordatabaseproductandreceivednumerousawardsforhisresearch,teaching,inventions,andpatentapplications.HeisaMemberoftheACM.
因篇幅问题不能全部显示,请点此查看更多更全内容