EzDevInfo.com

scoobi

A Scala productivity framework for Hadoop.

http://nicta.github.com/scoobi/

Scala mkString equivalent in Scoobi DList

I am trying to use folds or, better yet, mkString to concatenate the strings I have in a DList:

val dlist = DList("a", "b", "c")

so that I get a single string: "abc".

Apparently, for a regular list in Scala, list.mkString does the job.

Merging a list of Strings using mkString vs foldRight
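
For reference, the plain-Scala behaviour mentioned above can be sketched as follows. This only covers an ordinary in-memory List, not a Scoobi DList; the object and method names (ConcatDemo, viaMkString, viaFoldRight) are made up for illustration.

```scala
object ConcatDemo {
  // Concatenate with the built-in mkString (no separator).
  def viaMkString(xs: List[String]): String = xs.mkString

  // The same result with an explicit right fold, starting from the empty string.
  def viaFoldRight(xs: List[String]): String = xs.foldRight("")(_ + _)

  def main(args: Array[String]): Unit = {
    val list = List("a", "b", "c")
    println(viaMkString(list))  // prints "abc"
    println(viaFoldRight(list)) // prints "abc"
  }
}
```

Both calls rely on the whole list being available locally, which is the part that does not carry over directly to a distributed list.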

Is there an easy way to do this with Scoobi's distributed lists?

Source: (StackOverflow)

Hadoop Scoobi : instantiate a DList from XML

I am fairly new to Scala, Hadoop & Scoobi.

We have some Hadoop jobs where we process CSV files and run the Scoobi routines with:

  // Parse the input file
  val lines = fromTextFile(input)

  // Iterate on every element to generate the keys, and then aggregate it
  val counts = lines.mapFlatten( ...

1. I have the impression that I can't do this for XML files. Is that so, or can I process XML with Scoobi?

2. I think I can parse the XML nodes and flatten them into lines with Scala's native XML support. But then how do I create a Scoobi DList from them?

(Why? Because I will need to join it with another DList coming from a CSV file.)

Note: My XML consists of nodes like the following:

<add>
    <AdCampaign class="BCSAdCampaign">
        <Subscriber>TVC</Subscriber>
        <CampaignName>3402376</CampaignName>
        <CampaignId>1NTGXNAY</CampaignId>
        <AccountManager/>
        <FromDate>20130212</FromDate>
        <ToDate>20140207</ToDate>
        <ReportingInd>N</ReportingInd>
        <CampaignAdmin>NAWASTHI MCG-TVC</CampaignAdmin>
        <SalesChannel>TC8</SalesChannel>
        <Email/>
        <Advertiser>MU0</Advertiser>
        <Date>20150609</Date>
    </AdCampaign>
</add>
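
The "parse and flatten" step from point 2 might look roughly like the sketch below. It assumes the scala.xml library is on the classpath (bundled with older Scala versions, a separate scala-xml module in newer ones). The object name FlattenAdCampaigns, the choice of fields, and the comma separator are illustrative assumptions, not part of the original job. It only produces an in-memory sequence of lines; turning that into a Scoobi DList is the open question and is not shown here.

```scala
import scala.xml.XML

object FlattenAdCampaigns {
  // Child elements to extract, taken from the sample XML above (an arbitrary subset).
  val fields: Seq[String] =
    Seq("Subscriber", "CampaignName", "CampaignId", "FromDate", "ToDate")

  // Turn every <AdCampaign> node into one delimited line, CSV-style.
  // Empty elements such as <Email/> would yield an empty string.
  def flatten(xml: String, sep: String = ","): Seq[String] = {
    val root = XML.loadString(xml)
    (root \\ "AdCampaign").map { node =>
      fields.map(f => (node \ f).text.trim).mkString(sep)
    }
  }

  def main(args: Array[String]): Unit = {
    val sample =
      """<add>
        |  <AdCampaign class="BCSAdCampaign">
        |    <Subscriber>TVC</Subscriber>
        |    <CampaignName>3402376</CampaignName>
        |    <CampaignId>1NTGXNAY</CampaignId>
        |    <FromDate>20130212</FromDate>
        |    <ToDate>20140207</ToDate>
        |  </AdCampaign>
        |</add>""".stripMargin

    flatten(sample).foreach(println) // prints "TVC,3402376,1NTGXNAY,20130212,20140207"
  }
}
```

Flattening to one line per campaign keeps the data in the same shape as the CSV input, which is what a later join on a shared key would need.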

Source: (StackOverflow)