python - How to write unit tests for a sequence of data transformations?
I am trying to learn TDD while writing a script that transforms input data through a long series of functions. The problem is similar whether I write it in Python or R, so I guess it's more related to my understanding of TDD.
    # Outline of main in Python
    def main():
        data = get_data()
        data_a = transform_fun1(data)
        data_b = transform_fun2(data_a)
        data_c = transform_fun3(data_b)
        ....
        return data_x

    # Outline of main in R
    main <- function() {
      data <- get_data() %>%
        transform_fun1() %>%
        transform_fun2() %>%
        transform_fun3() %>%
        ...
      data_x
    }
What's the best process for writing unit tests for each transform_fun, knowing that each one needs the result of the previous transform_fun as its input?
In the beginning it looks clean, but the further I go, the more of main I start to reproduce in each test, which doesn't smell good. Reproducing entire parts of the main process looks counter-intuitive to the idea of unit testing.
    # In Python (pytest)
    def test_transform_fun_n(data):
        data_a = transform_fun1(data)
        data_b = transform_fun2(data_a)
        ...
        data_n = transform_fun_n(data_n_minus_1)
        assert data_n == blabla

    # In R (testthat)
    test_that("transform_fun_n expect", {
      data_a <- transform_fun1(data)
      data_b <- transform_fun2(data_a)
      ...
      data_n <- transform_fun_n(data_n_minus_1)
      expect_equal(data_n, blabla)
    })
I tried adding a fixture between each step (at least in Python), but that doesn't seem ideal either.
-- Edit -- Trying to sketch what VoiceOfUnreason's answer would look like:
    def transformv1(data):
        return data + x

    def transformv2(data):
        return transformv1(data) + y

    def transformv3(data):
        return transformv2(data) + z

    def main():
        data = get_data()
        return transformv3(data)
> In the beginning it looks clean, but the further I go, the more of main I start to reproduce in each test, which doesn't smell good. Reproducing entire parts of the main process looks counter-intuitive to the idea of unit testing.
Yes, you are right. The code is trying to tell you that your specifications (and your production code) are written at the wrong abstraction level.
    def test_transformv1(data, expected):
        actual = transformv1(data)
        assert actual == expected

    def main():
        data = get_data()
        return transformv1(data)
When the requirements change, you write a new test for the new specification:
    def test_transformv2(data, expected):
        actual = transformv2(data)
        assert actual == expected

    def test_transformv1(data, expected):
        actual = transformv1(data)
        assert actual == expected

    def main():
        data = get_data()
        return transformv2(data)
The key ideas here are (a) that your unit tests exercise functions provided by your production code, and (b) that new requirements mean a new function. That new function may be implemented in terms of the others, but the test only checks that the new function returns the right result.
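As a minimal sketch of those two ideas, assume the data is a list of numbers and that the question's undefined x and y are fixed offsets. The names X, Y, transform_v1 and transform_v2 are hypothetical stand-ins, not code from the question. The tests are plain functions with bare asserts, which pytest collects by their test_ prefix.

```python
# Hypothetical offsets standing in for the question's undefined x and y.
X = 10
Y = 5


def transform_v1(data):
    """First specification: add X to every value."""
    return [value + X for value in data]


def transform_v2(data):
    """Second specification, implemented in terms of transform_v1."""
    return [value + Y for value in transform_v1(data)]


def test_transform_v1():
    # Hand-written input and expected output; no other step is called.
    assert transform_v1([1, 2, 3]) == [11, 12, 13]


def test_transform_v2():
    # The test calls only transform_v2. That it reuses transform_v1
    # internally is an implementation detail the test does not repeat.
    assert transform_v2([1, 2, 3]) == [16, 17, 18]
```

Neither test rebuilds the pipeline by hand: each one feeds a small literal input to the function under test and compares against a literal expected value.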
If main is hard to test (a common problem for an imperative shell), then you want to make it as thin as you possibly can.
Make it so simple that there are obviously no deficiencies.
Long chains of transformations need to be refactored from the shell into the core, given a name, and so on.
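One way to read "given a name" is to pull a run of steps out of main and into a single named function in the core. The sketch below assumes three made-up steps (drop_missing, to_float, normalise) and a made-up name for the chain (clean); none of them come from the question.

```python
from functools import reduce


def drop_missing(rows):
    """Remove None entries."""
    return [row for row in rows if row is not None]


def to_float(rows):
    """Convert every entry to float."""
    return [float(row) for row in rows]


def normalise(rows):
    """Scale the values so that they sum to 1."""
    total = sum(rows)
    return [row / total for row in rows]


# The chain that used to be spelled out inside main.
CLEANING_STEPS = (drop_missing, to_float, normalise)


def clean(rows):
    """Named core function: apply each cleaning step in order."""
    return reduce(lambda acc, step: step(acc), CLEANING_STEPS, rows)


def test_clean():
    # One test for the named chain, using a literal input.
    assert clean(["1", None, "3"]) == [0.25, 0.75]
```

The individual steps can still get their own small tests, but the test for clean does not need to call them one by one.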
> Do you mean the code should be written more like what I added at the end of my question?
Yes, that's the idea: the imperative shell accesses the functional core through the same entry point as one of your tests.
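A rough illustration of that split, assuming the input is a JSON array of records read from standard input. The record layout and the transform function are invented for this example. The only thing main does beyond I/O is call transform once, which is the same call the test makes.

```python
import json
import sys


def transform(records):
    """Functional core: pure, no I/O (hypothetical example logic)."""
    return sorted(
        record["name"].strip().lower()
        for record in records
        if record.get("name")
    )


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Imperative shell: read input, call the core once, write output."""
    records = json.load(stdin)
    json.dump(transform(records), stdout)


def test_transform():
    # Same entry point as main, but with in-memory data instead of I/O.
    records = [{"name": " Bob "}, {"name": "alice"}, {"id": 3}]
    assert transform(records) == ["alice", "bob"]
```

With the shell this thin, there is little left in main that a unit test could catch, and the streams can be swapped for in-memory ones if an end-to-end check is wanted.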