replicate in Ruby

Replicate each row of data.frame and specify the number of replications for each row

df <- data.frame(var1=c('a', 'b', 'c'), var2=c('d', 'e', 'f'), freq=1:3)

What is the simplest way to expand the first two columns of the data.frame above, so that each row appears the number of times specified in the column 'freq'?

In other words, go from this:

df
  var1 var2 freq
1    a    d    1
2    b    e    2
3    c    f    3

To this:

df.expanded
  var1 var2
1    a    d
2    b    e
3    b    e
4    c    f
5    c    f
6    c    f
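
One common base R idiom for this kind of expansion is to index the rows with a repeated row index. The following is a minimal sketch under that assumption, not necessarily the simplest or fastest approach; row_index is a name introduced here for illustration.

```r
df <- data.frame(var1 = c("a", "b", "c"), var2 = c("d", "e", "f"), freq = 1:3)

# Repeat each row number as many times as its 'freq' value (row_index is illustrative).
row_index <- rep(seq_len(nrow(df)), times = df$freq)

# Select the repeated rows and keep only the first two columns.
df.expanded <- df[row_index, c("var1", "var2")]

# Reset the row names so they run 1..n instead of 1, 2, 2.1, ...
rownames(df.expanded) <- NULL
print(df.expanded)
```

Because the indexing is done once on the whole data.frame, no explicit loop over rows is needed.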

Source: (StackOverflow)

How to duplicate a MySQL database on the same server

I have a large MySQL database, let's call it live_db, which I want to replicate on the same machine to provide a test system to play around with (test_db), including table structure and data. At regular intervals I want to update test_db with the content of live_db, incrementally if possible.

Is there some built-in mechanism in MySQL to do that? I think that master-slave replication is not the thing I want since it should be possible to alter data in the test_db. These changes do not have to be preserved, though.

Regards,

CGD

Source: (StackOverflow)

Replicate() versus a for loop?

Does anyone know how the replicate() function works in R and how efficient it is relative to using a for loop?

For example, is there any efficiency difference between...

means <- replicate(100000, mean(rnorm(50)))

And...

means <- c()
for(i in 1:100000) { 
   means <- c(means, mean(rnorm(50)))
}

(I may have typed something slightly off above, but you get the idea.)
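
For context, the base R documentation describes replicate() as a wrapper around sapply(), so much of the cost difference in the example above plausibly comes from growing means with c() on every iteration rather than from the loop itself. The sketch below compares replicate() with a loop that preallocates its result. The value of n is reduced to 1000 purely to keep the example quick, and the variable names are illustrative.

```r
n <- 1000  # reduced from 100000 only to keep the sketch fast

# Version 1: replicate() evaluates the expression n times and collects the results.
set.seed(42)
means_replicate <- replicate(n, mean(rnorm(50)))

# Version 2: a for loop that fills a preallocated vector
# instead of growing it with c() on every iteration.
set.seed(42)
means_loop <- numeric(n)
for (i in seq_len(n)) {
  means_loop[i] <- mean(rnorm(50))
}

# With the same seed, both versions draw the same random numbers in the same order.
stopifnot(isTRUE(all.equal(means_replicate, means_loop)))
```

Wrapping each version in system.time() is one way to measure the difference on a given machine.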

Source: (StackOverflow)

Using function "cat" with "replicate" in R

Is there a way to combine the function "cat" with the function "replicate" in R?

I want to see how many iterations R has completed at a given moment. However, instead of a "for" loop, I would prefer to use "replicate". See the simple example below:

 Data <- rnorm(20,20,3)

 # with for loop
 N <- 1000
 outcome <- NULL

 for(i in 1:N){
      Data.boot <- sample(Data, replace=TRUE)
      outcome[i] <- mean(Data.boot)
      cat("\r", i, "of", N)
 }

  #the same but with replicate 
  f <- function() {
  Data.boot <- sample(Data, replace=TRUE)
  outcome <- mean(Data.boot)
  return(outcome)
  }
  replicate(N, f())

Any ideas on how to use "cat" with "replicate" (or other ways to see how many times the function of interest has been executed by "replicate") would be much appreciated. Thank you!
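
One possible approach, sketched here as an assumption rather than an established recipe, is to keep a counter outside the function and update it with the superassignment operator each time the function runs. The counter name i is reused from the for loop version above.

```r
Data <- rnorm(20, 20, 3)
N <- 1000

# Counter living outside f(); f() increments it on every call.
i <- 0

f <- function() {
  i <<- i + 1                  # update the counter in the enclosing environment
  cat("\r", i, "of", N)        # overwrite the same console line with the progress
  mean(sample(Data, replace = TRUE))
}

outcome <- replicate(N, f())
cat("\n")                      # finish the progress line
```

The counter has to be reset to 0 before each new run of replicate(), since it persists between calls.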

Source: (StackOverflow)

R data.table efficient replication by group

I am running into some memory allocation problems trying to replicate some data by groups using data.table and rep.

Here is some sample data:

ob1 <- as.data.frame(cbind(c(1999),c("THE","BLACK","DOG","JUMPED","OVER","RED","FENCE"),c(4)),stringsAsFactors=FALSE)
ob2 <- as.data.frame(cbind(c(2000),c("I","WALKED","THE","BLACK","DOG"),c(3)),stringsAsFactors=FALSE)
ob3 <- as.data.frame(cbind(c(2001),c("SHE","PAINTED","THE","RED","FENCE"),c(1)),stringsAsFactors=FALSE)
ob4 <- as.data.frame(cbind(c(2002),c("THE","YELLOW","HOUSE","HAS","BLACK","DOG","AND","RED","FENCE"),c(2)),stringsAsFactors=FALSE)
sample_data <- rbind(ob1,ob2,ob3,ob4)
colnames(sample_data) <- c("yr","token","multiple")

What I am trying to do is replicate the tokens (in the present order) by the multiple for each year.

The following code works and gives me the answer I want:

good_solution1 <- ddply(sample_data, "yr", function(x) data.frame(rep(x[,2],x[1,3])))

good_solution2 <- data.table(sample_data)[, rep(token,unique(multiple)),by = "yr"]

The issue is that when I scale this up to 40mm+ rows, I run into memory issues with both solutions.

If my understanding is correct, these solutions are essentially doing an rbind, which allocates every time.

Does anyone have a better solution?

I looked at set() for data.table but was running into issues because I wanted to keep the tokens in the same order for each replication.

Thanks ahead of time!

Source: (StackOverflow)

How to use Replicate in MS Access Database?

In SQL Server, I use the following query to replicate a string:

select replicate('0',5)

It gives the result:

00000

What is the equivalent of this in MS Access?

Source: (StackOverflow)

Haskell ReplicateM IO

I'm trying to create a function which allows the user to input a list of strings. The function takes the length of the first line and allows the user to input length-1 more lines. Then each line is checked to ensure it is the same length as the original line. However, I'm having a few problems that I can't find a solution to.

The problems are that I can input more than count-1 lines, and the length isn't being calculated as I expected. For example, if I input ["12","13"] and then ["121","13"], the error is given, although they are the same length!

read :: IO [Line]
read = do
  line <- getLine
  let count = length line
  lines <- replicateM (count-1) $ do
    line <- getLine
    if length line /= count
    then fail "too long or too short"
    else return line
  return $ line : lines

Line is of type String.

readLn gives a parse error.

Source: (StackOverflow)

R how to grow a data.table

I have a large data.table which looks like this:

custid, dayofweek, revenue
AA 2 345
AA 3 545
BB 1 544
BB 4 456
CC 7 231

I would like to "grow" this data table such that it has all 7 numbers for each custid with the revenue column set to NA. Example shown below.

custid, dayofweek, revenue
AA 1 NA
AA 2 345
AA 3 545
AA 4 NA
AA 5 NA
AA 6 NA
AA 7 NA
BB 1 544
BB 2 NA
BB 3 NA
BB 4 456
BB 5 NA
BB 6 NA
BB 7 NA
CC 1 NA
CC 2 NA
CC 3 NA
CC 4 NA
CC 5 NA
CC 6 NA
CC 7 231

Growing it that way is definitely not a join operation. Any help appreciated. Thanks in advance.
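
One way to look at this, offered as a hedged sketch rather than a definitive answer, is as a left join of the complete custid by dayofweek grid against the original data. The example below uses a plain data.frame with base R's expand.grid() and merge() instead of data.table, which is an assumption made to keep it self-contained; full_grid and grown are illustrative names.

```r
dt <- data.frame(
  custid    = c("AA", "AA", "BB", "BB", "CC"),
  dayofweek = c(2L, 3L, 1L, 4L, 7L),
  revenue   = c(345, 545, 544, 456, 231),
  stringsAsFactors = FALSE
)

# Every combination of customer and day of week (7 rows per customer).
full_grid <- expand.grid(
  dayofweek = 1:7,
  custid    = unique(dt$custid),
  stringsAsFactors = FALSE
)

# Keep all grid rows; revenue is NA wherever the original data has no match.
grown <- merge(full_grid, dt, by = c("custid", "dayofweek"), all.x = TRUE)
grown <- grown[order(grown$custid, grown$dayofweek),
               c("custid", "dayofweek", "revenue")]
rownames(grown) <- NULL
print(grown)
```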

Source: (StackOverflow)

Perl Hash Slice, Replication x Operator, and sub params

OK, I understand Perl hash slices and the "x" operator in Perl, but can someone explain the following (slightly simplified) code example?

sub test{
    my %hash;
    @hash{@_} = (undef) x @_;
}

Example Call to sub:

test('one', 'two', 'three');

This line is what throws me:

@hash{@_} = (undef) x @_;

It is creating a hash where the keys are the parameters to the sub and initializing to undef, so:

%hash:

'one' => undef, 'two' => undef, 'three' => undef

The right operand of the x operator should be a number; how is it that @_ is interpreted as the length of the sub's parameter array? I would expect you'd at least have to do this:

@hash{@_} = (undef) x scalar @_;

Source: (StackOverflow)

R How to replicate nulls in a list

list(list(NULL,NULL),list(NULL,NULL))

The result is:

[[1]]
[[1]][[1]]
NULL

[[1]][[2]]
NULL

[[2]]
[[2]][[1]]
NULL

[[2]][[2]]
NULL

Supposing I want to do this for larger numbers than 2, is there a way to get the same list structure with replicate?
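
A possible sketch, assuming the goal is a list of n_outer lists that each hold n_inner NULLs: replicate() with simplify = FALSE returns a plain list, and vector("list", n) creates a list of n NULL elements. The helper nested_nulls and its parameter names are hypothetical, introduced only for this illustration.

```r
# Hypothetical helper: a list of 'n_outer' lists, each holding 'n_inner' NULLs.
nested_nulls <- function(n_outer, n_inner) {
  # vector("list", n_inner) is a list of n_inner NULL elements;
  # simplify = FALSE stops replicate() from trying to build an array.
  replicate(n_outer, vector("list", n_inner), simplify = FALSE)
}

small <- nested_nulls(2, 2)
str(small)

# Same structure for larger sizes.
larger <- nested_nulls(4, 3)
```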

Source: (StackOverflow)

Haskell: recursive Replicate function

I'm just getting started with Haskell. I'm trying to create a function that imitates the standard replicate function in Haskell, but using recursion. For example,

Prelude> replicate 3 "Ha!"
["Ha!","Ha!","Ha!"]

It should be of type Int -> a -> [a]. So far I have:

myReplicate :: Int -> a -> [a]
myReplicate x y = y : myReplicate (x-1) y
myReplicate 0 y = [ ]

However, my function always generates infinite lists:

Prelude> myReplicate 3 "Ha!"
["Ha!","Ha!","Ha!","Ha!","Ha!","Ha!","Ha!",...

Source: (StackOverflow)

Repeating a repeated sequence

We want to get an array that looks like this:

1,1,1,2,2,2,3,3,3,4,4,4,1,1,1,2,2,2,3,3,3,4,4,4,1,1,1,2,2,2,3,3,3,4,4,4

What is the easiest way to do it?
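
The question does not name a language. Assuming R, one short sketch uses rep() twice: first with each to repeat every value, then with times to repeat the whole block. The names block and result are illustrative.

```r
# Repeat each of 1..4 three times: 1 1 1 2 2 2 3 3 3 4 4 4
block <- rep(1:4, each = 3)

# Repeat that whole block three times.
result <- rep(block, times = 3)
print(result)

# rep() also accepts both arguments in a single call.
stopifnot(identical(result, rep(1:4, each = 3, times = 3)))
```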

Source: (StackOverflow)

Is there a way to access the iteration number in replicate()?

Is there some way to access the current replication number in the replicate function so I can use it as a variable in the repeated evaluation? For example in this trivial example I'd like to use the current replication number to generate a list of variable length vectors of the current replication number. For example, x below would represent the current replicate:

replicate( 3 , rep( x , sample.int(5,1) ) )

I know this trivial example is easy to do with lapply

lapply( 1:3 , function(x) rep( x , sample.int(5,1) ) )

But can you access the replication counter in replicate?
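
As far as the documented arguments of replicate() go, it takes only n, expr, and simplify, so there is no exposed counter. One workaround, sketched here as an assumption rather than a recommended idiom, is to keep a counter outside the expression and increment it on each evaluation. The name counter is hypothetical.

```r
# Counter kept outside the replicated expression.
counter <- 0

res <- replicate(3, {
  counter <<- counter + 1               # advance the counter on each evaluation
  rep(counter, sample.int(5, 1))        # use it like 'x' in the lapply version
}, simplify = FALSE)                    # always return a list, never a matrix

str(res)
```

When the index is needed anyway, the lapply version shown above is arguably the more direct choice.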

Source: (StackOverflow)

replicate() class xts into a list

I have an xts object, frame:

frame <- structure(c("a", "a", "a"), .Dim = c(3L, 1L), index = structure(c(946702800, 
946749600, 946796400), tzone = "", tclass = c("POSIXct", "POSIXt"
)), class = c("xts", "zoo"), .indexCLASS = c("POSIXct", "POSIXt"
), tclass = c("POSIXct", "POSIXt"), .indexTZ = "", tzone = "")


> frame
                    [,1]
2000-01-01 05:00:00 "a" 
2000-01-01 18:00:00 "a" 
2000-01-02 07:00:00 "a"

I want to make a list of length 5 from this xts object.

But when I do this I lose the date and time. How can I create a list of replicated xts objects without losing the xts class?

> class(frame)
[1] "xts" "zoo"
> class( replicate(5, frame)[1])
[1] "character"

> replicate(5, frame)
, , 1

     [,1]
[1,] "a" 
[2,] "a" 
[3,] "a"    # seriously... :(

.........
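
The loss of class seen above is consistent with replicate() simplifying its results into an array by default. The sketch below illustrates the same effect with a base R Date vector instead of an xts object, so that it runs without extra packages. That substitution is an assumption, and whether simplify = FALSE fully preserves a given xts object should be checked against the xts package itself. The name stamps is illustrative.

```r
# A classed base R object standing in for the xts object (assumption).
stamps <- as.Date("2000-01-01") + 0:2

# Default behaviour: results are simplified into a matrix and the class is dropped.
simplified <- replicate(5, stamps)
inherits(simplified, "Date")

# simplify = FALSE returns a plain list whose elements are left untouched.
copies <- replicate(5, stamps, simplify = FALSE)
class(copies[[1]])

# rep() on a one-element list is an alternative that avoids replicate() entirely.
copies_rep <- rep(list(stamps), 5)
```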

Source: (StackOverflow)

How are BRR weights used in the survey package for R?

Does anyone know how to use BRR weights in Lumley's survey package for estimating variance if your dataset already has BRR weights in it?

I am working with PISA data, and they already include 80 BRR replicates in their dataset. How can I get as.svrepdesign to use these, instead of trying to create its own? I tried the following and got the subsequent error:

dstrat <- svydesign(id=~uniqueID,strata=~strataVar, weights=~studentWeight, 
                data=data, nest=TRUE)
dstrat <- as.svrepdesign(dstrat, type="BRR")

Error in brrweights(design$strata[, 1], design$cluster[, 1], ..., 
    fay.rho = fay.rho,  : Can't split with odd numbers of PSUs in a stratum

Any help would be greatly appreciated, thanks.

Source: (StackOverflow)