python - How to write unit tests for a sequence of data transformations?

May 15, 2013

I am trying to learn TDD while writing a script that transforms input data through a long series of functions. The problem is similar whether I write it in Python or R; I guess it is more related to my understanding of TDD.

    # main in Python
    def main():
        data = get_data()
        data_a = transform_fun1(data)
        data_b = transform_fun2(data_a)
        data_c = transform_fun3(data_b)
        ...
        return data_x

    # main in R
    main <- function() {
        data <- get_data() %>%
          transform_fun1() %>%
          transform_fun2() %>%
          transform_fun3() %>%
          ...
        data_x
    }

What is the best process for writing unit tests for each transform_fun, knowing that each one needs the result of the previous transform_fun as its input?

In the beginning it looks clean, but the further I go, the more of main I start to reproduce in each test, and that doesn't smell good. Reproducing entire parts of the main process seems to run counter to the idea of unit testing.

    # in Python (pytest)
    def test_transform_fun_n(data):
        data_a = transform_fun1(data)
        data_b = transform_fun2(data_a)
        ...
        # data_n_minus_1 is the output of the previous step in the chain
        data_n = transform_fun_n(data_n_minus_1)
        assert data_n == blabla

    # in R (testthat)
    test_that("transform_fun_n expect", {
        data_a <- transform_fun1(data)
        data_b <- transform_fun2(data_a)
        ...
        # data_n_minus_1 is the output of the previous step in the chain
        data_n <- transform_fun_n(data_n_minus_1)
        expect_that(data_n, equals(blabla))
    })

I tried adding a fixture between each step (at least in Python), but that doesn't seem ideal either.

-- Edit -- Trying to sketch what VoiceOfUnreason's answer would look like:

    def transformv1(data):
        return data + x

    def transformv2(data):
        return transformv1(data) + y

    def transformv3(data):
        return transformv2(data) + z

    def main():
        data = get_data()
        return transformv3(data)

"In the beginning it looks clean, but the further I go, the more of main I start to reproduce in each test, and that doesn't smell good. Reproducing entire parts of the main process seems to run counter to the idea of unit testing."

Yes, you are right. The code is trying to tell you that your specifications (and your production code) are written at the wrong abstraction level.

    def test_transformv1(data, expected):
        actual = transformv1(data)
        assert actual == expected

    def main():
        data = getdata()
        return transformv1(data)

When the requirements change, you write a new test for the new specification:

    def test_transformv2(data, expected):
        actual = transformv2(data)
        assert actual == expected

    def test_transformv1(data, expected):
        actual = transformv1(data)
        assert actual == expected

    def main():
        data = getdata()
        return transformv2(data)

The key ideas here are that (a) your unit tests exercise functions provided by the production code, and (b) new requirements mean a new function. That new function may be implemented in terms of the others, but the test checks only that the new function returns the right result.
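A minimal sketch of these two ideas, assuming that data is a plain number and that the hypothetical constants X and Y stand in for the undefined x and y of the question's edit. Each test calls exactly one production function with raw input and never rebuilds the chain itself:

```python
# Hypothetical constants; the original leaves x and y undefined.
X = 1
Y = 10


def transformv1(data):
    return data + X


def transformv2(data):
    # Implemented in terms of transformv1, but callers need not know that.
    return transformv1(data) + Y


# Hand-picked (input, expected) pairs; the values are illustrative only.
V1_CASES = [(0, 1), (5, 6)]
V2_CASES = [(0, 11), (5, 16)]


def test_transformv1():
    for data, expected in V1_CASES:
        assert transformv1(data) == expected


def test_transformv2():
    # Starts from raw input, not from a precomputed transformv1 result.
    for data, expected in V2_CASES:
        assert transformv2(data) == expected
```

The test functions take no arguments here, so pytest can collect them as they are; supplying the cases through a parametrized test would work just as well.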

If main is hard to test (a common problem for an imperative shell), then you want to make it as thin as you possibly can.

Make it so simple that there are no deficiencies.

Long chains of transformations need to be refactored out of the shell and into the core, given a name, and so on.
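One way to read this, sketched under the assumption that every step is a pure function of a single argument: the chain that used to live in main moves into the core and gets a name of its own. The step functions, the pipeline helper and the name clean_records are hypothetical placeholders, not part of the original code.

```python
from functools import reduce


def strip_blanks(rows):
    # Hypothetical step: drop entries that are empty or whitespace only.
    return [r for r in rows if r.strip()]


def normalize_case(rows):
    # Hypothetical step: lower-case every entry.
    return [r.lower() for r in rows]


def deduplicate(rows):
    # Hypothetical step: remove repeats while keeping first-seen order.
    return list(dict.fromkeys(rows))


def pipeline(*steps):
    """Compose single-argument functions, applied left to right."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)


# The chain now has a name and lives in the core, so a test can call it directly.
clean_records = pipeline(strip_blanks, normalize_case, deduplicate)


def test_clean_records():
    assert clean_records(["A", " ", "a", "B"]) == ["a", "b"]
```

The test for clean_records states input and expected output for the named chain as a whole; the individual steps can still have their own small tests where that is useful.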

Do you mean the code should be written more like what I added at the end of the question?

Yes, that's the idea: the imperative shell accesses the functional core using the same entry point as one of the tests.
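As a sketch of that arrangement, assume the core entry point is the transformv3 from the question's edit and that the shell reads a single integer from a text stream. The constants and the input format are assumptions made purely for illustration.

```python
import sys

# Hypothetical constants; the question leaves x, y and z undefined.
X, Y, Z = 1, 10, 100


# Functional core: pure functions, no I/O.
def transformv1(data):
    return data + X


def transformv2(data):
    return transformv1(data) + Y


def transformv3(data):
    return transformv2(data) + Z


# Imperative shell: I/O only, kept as thin as possible.
def get_data(stream):
    # Assumed input format: a single integer on the first line.
    return int(stream.readline())


def main(stream=sys.stdin):
    data = get_data(stream)
    return transformv3(data)  # the same entry point the test uses


def test_transformv3():
    assert transformv3(0) == 111
```

Because main only fetches the data and hands it to transformv3, nearly all of the behaviour is covered by a test that never touches I/O.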
