How Model 2 Fact Tables with some common dimensions!

Hello

trying to model 2 Fact Tables with some common dimensions,

I managed to generate the model but the results of the Query are very wrong, it is like it is doing a cross join or something

is there a documentation on how to model multiple fact tables ?

thanks

dataset BIENGINETESTING
// this is an auto-generated default dataset
// please feel free to delete it, if not used
table actual budget category_dim
join actual inner join category_dim
on actual.category = category_dim.category
join budget inner join category_dim
on budget.category = category_dim.category

Hi Mim - thanks for your question. I’ve flagged with the customer support team and someone will get back to you ASAP - Jeremy

Update
Or maybe our CEO who invented our data modeling language will jump in…

Do you have a one-to-many relationship between each fact table table and the dimension table?

Fact1
id metric1
1 2
1 3

Fact2
id metric2
1 20
1 30

DimensionTable
id
1

In that case, according to standard SQL semantics, a simultaneous join is equivalent to a cross join.

Fact1 JOIN Fact1 JOIN DimensonTable
id metric1 metric2
1 2 20
1 2 30
1 3 20
1 3 30

Please let us know if our understanding is correct. Whether we are right or wrong in our interpretation of your problem, if we know your intent and how your tables are related, we can help find the correct solution.

BTW if you haven’t already done so, you can look at the auto-generated SQL by clicking on “View SQL” (near top right) when you are exploring a dataset.

here is my model

I expect the SQL to be something like this

SELECT category,
actualhours ,
budgethours ,

FROM (
SELECT category, SUM(actualhours) AS actualhours
FROM xxxxxxxx.starschema.actual

GROUP BY category
) t2
FULL OUTER JOIN (
SELECT category, SUM(budgethours) AS budgethours
FROM xxxxxxx.starschema.budget
GROUP BY category
) t3 USING(category)

We strive to provide general and powerful solutions to wide a variety of cases, while respecting basic semantics of SQL.

I understand if you expect the above SQL to be generated from your data model. Generalizing it to the case where the two JOINs use different columns may be hard. We will certainly look into this.

Meanwhile, one possible data model that would produce queries similar to your intent is as follows.

table budget
  sql xxxxxxx.starschema.budget
  datasource <yourDataSource>
  ...

table actual
  sql xxxxxxx.starschema.actual
  datasource <yourDataSource>
  ...

table joinableBudget
  sql {budget category sumBudgetHours}
  datasource <yourDataSource>
  column sumBudgetHours
    sql SUM(budgethours)
    type int
  column category
    sql ...
  ...

table joinableActual
  sql {actual category sumActualHours}
  datasource <yourDataSource>
  column sumActualHours
    sql SUM(actualhours)
    type int
  column category
    sql ...
  ...


dataset BIENGINETESTING
  table joinableBudget joinableActual
  join joinableBudget full outer join category_dim
    on joinableBudget.category = category_dim.category
  join joinableActual full outer join category_dim
    on joinableActual.category = category_dim.category

We could have defined a join rule between joinableBudget and joinableActual directly. But the above solution will work even when you select additional columns from category_dim.

The {} syntax for a table sql is documented here: Views using Structured SQL | bipp Analytics

As always, we solicit and welcome any feedback. If any of this looks unnecessarily complex, we would love to hear back from you.

1 Like

Thanks a lot, that’s very interesting approach, maybe it will be useful to show cases like this in the documentation

Yes, I completely agree. Out documentation is still far from ideal. We are working on it.