52 changes: 52 additions & 0 deletions CONVERSION_README.md
@@ -0,0 +1,52 @@
# WordPress to Markdown Conversion

This repository contains 53 blog posts that were converted from WordPress export files to Jekyll-compatible Markdown format.

## Conversion Process

The WordPress export files were located in `_drafts/will_not_backport/` as text files with the naming pattern `wp_post_*.txt`.

The conversion was performed using the `convert_wordpress_to_markdown.py` script, which:

1. Parsed WordPress export format (key-value pairs with quoted strings)
2. Extracted metadata (title, date, post_status, post_type, etc.)
3. Filtered for published posts only (`post_status: "publish"` and `post_type: "post"`)
4. Converted HTML content to Markdown using the html2text library
5. Generated Jekyll front matter with appropriate metadata
6. Created properly named markdown files in the format `YYYY-MM-DD-slug.md`

## Converted Posts

- **Total WordPress files**: 278
- **Published posts converted**: 53
- **Date range**: 2011-05-19 to 2017-09-29

## Known Limitations

The WordPress export format used a non-standard encoding in which newlines appear in the exported text as a bare letter 'n'. The conversion script attempts to handle this, but it cannot reliably distinguish between:
- 'n' as a newline character
- 'n' as part of a word (like "cannot", "connection", "application")
- HTML entity encoding (like `&#40;` for parentheses)

Some formatting issues remain in the converted posts:

- **Standalone 'n' characters**: May appear in some posts as conversion artifacts
- **Broken words**: Words like "connection" may appear as "co ection"
- **Code blocks**: May have formatting issues due to complex HTML entity encoding
- **Special characters**: Some may not be perfectly converted

**Recommendation**: These posts serve as a historical record and baseline. Individual posts can be manually corrected as needed when they are accessed or edited in the future.

## Usage

To re-run the conversion (if needed):

```bash
python3 convert_wordpress_to_markdown.py
```

Note: This will require the `html2text` Python package to be installed:

```bash
pip3 install html2text
```
69 changes: 69 additions & 0 deletions _posts/2011-05-19-using-ranges-and-functional-programming-in-c.md
@@ -0,0 +1,69 @@
---
layout: post
title: "Using Ranges and Functional Programming in C++"
date: 2011-05-19 21:47:12
categories: blog
---
C++ is a very versatile language. Among other things, you can do generic meta-programming and functional programming in C++, as well as the better-known facilities for procedural and object-oriented programming. In this installment, we will look at the functional programming facilities in the now-current C++ standard (C++03) as well as the upcoming C++0x standard. We will look at what a _closure_ is and how to apply one to a range, but we will first look at some simpler uses of ranges to warm up.

If you look at the current version of Chausette, in the code for episode 28, you will find this:

```cpp
int main(int argc, const char **argv)
{
    Application::Arguments arguments(argc);
    std::copy(argv, argv + argc, arguments.begin());
    Application application;
    try
    {
        application.run(arguments);
    }
    catch (...)
    {
        std::cerr << "An error occurred" << std::endl;
    }
}
```

On line 4 of this listing, you can see our first use of a range: using `copy`, we copy the range of arguments passed to the application into the `arguments` vector.[1](http://rlc.vlinder.ca/blog/2011/05/using-ranges-and-functional-programming-in-c-cpp4theselftaught/#footnote_0_1399 "Note that this is not functional programming \(yet\), but in order to understand how functional programming is thought of in \(current\) C++, it is important to understand how ranges work.") The range that contains all the arguments is `argc` in size (which is why the vector is initialized to contain `argc` elements) and starts at `argv`. This same approach to ranges works for all C-style arrays: the `begin` of the range points at the first element, the `end` of the range points one past the last element. We note a range like this: `[begin, end)`. Using `begin` and `end` in this manner works for STL containers as well, and is the basic premise for all STL algorithms.
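To make the `[begin, end)` convention concrete, here is a minimal sketch that walks a C-style array and an STL container with the same loop. It is not taken from Chausette: the names `sum_range`, `sum_array` and `sum_vector` are made up for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, for illustration only: sums every element in the
// half-open range [begin, end). `end` itself is never dereferenced.
template < typename Iter >
int sum_range(Iter begin, Iter end)
{
    int total = 0;
    for (; begin != end; ++begin)
    {
        total += *begin;
    }
    return total;
}

// The helper applied to a C-style array...
int sum_array()
{
    int a[] = {1, 2, 3, 4};
    const std::size_t count = sizeof(a) / sizeof(a[0]);
    return sum_range(a, a + count); // a + count points one past the last element
}

// ...and to an STL container holding the same values.
int sum_vector()
{
    int a[] = {1, 2, 3, 4};
    std::vector< int > v(a, a + 4); // this constructor also takes a [begin, end) range
    return sum_range(v.begin(), v.end());
}
```

Both functions go through the same loop and produce the same sum. An empty range is simply one where `begin == end`, in which case the loop body never runs.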

If you look at the code for `std::copy` you'll find something like this[2](http://rlc.vlinder.ca/blog/2011/05/using-ranges-and-functional-programming-in-c-cpp4theselftaught/#footnote_1_1399 "The real code will likely be more complicated because of some optimizations the implementation may do, but the general idea is the same."):

```cpp
template < typename InIter, typename OutIter >
OutIter copy(InIter begin, InIter end, OutIter result)
{
    for (; begin != end; ++begin)
    {
        *result++ = *begin;
    }
    return result;
}
```

So why not implement the loop directly?

There are many reasons not to implement the loop directly in the code. One is the age-old reason of code re-use. It is for that reason that we practice object-oriented programming, that we have libraries of code and that we have functions. We re-use code because that means we don't have to write as much code (laziness is a virtue in this case) and because we only have to debug the code once. If the code is well-written, having debugged it once means we don't even have to look at it ever again.

For those same reasons, C++ has generic template meta-programming, allowing `copy` to be used for any sort of range containing elements of any type - as long as they are **Assignable**. In this case, we've used it to implement copying a range of C-style strings into a vector of C++-style strings but the same code can copy arrays of integers, the contents of STL containers, etc. Note, by the way, that the copy we did here involves an implicit conversion of the C-style string to the C++-style string: we didn't have to provide any extra code for that because the `std::string` constructor allows for implicit conversion of `const char *`.
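As a rough sketch of that genericity, the same `std::copy` call can be pointed at C-style strings or at integers without any change to the algorithm. The helper names `to_strings` and `copy_ints` are assumptions made for this example and do not appear in Chausette.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative helper: copies `count` C-style strings into a vector of
// std::string, much like main() above does with argv.
std::vector< std::string > to_strings(const char *const *first, int count)
{
    std::vector< std::string > result(count); // room for `count` empty strings
    // Each element is assigned from a const char *; std::string accepts
    // that without any extra code on our side.
    std::copy(first, first + count, result.begin());
    return result;
}

// The same algorithm, unchanged, copies integers between two containers.
std::vector< int > copy_ints(const std::vector< int > &source)
{
    std::vector< int > result(source.size());
    std::copy(source.begin(), source.end(), result.begin());
    return result;
}
```

In both cases the destination is sized before the copy, because `std::copy` writes through the output iterator and does not grow the container by itself.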

Let's go a bit further in the code and see what happens in `Server::update`:

```cpp
struct Functor
{
    Functor(fd_set &an_fd_set, bool Socket::* member, int &highest_fd)
        : fd_set_(an_fd_set)
        , member_(member)
        , highest_fd_(highest_fd)
    { /* no-op */ }

    Functor &operator()(const Socket &socket)
    {
        if (!(socket.*member_))
        {
            FD_SET(socket.fd_, &fd_set_);
            if (highest_fd_ < socket.fd_)
                highest_fd_ = socket.fd_;
        }
        else
        { /* don't want this one */ }
        return *this;
    }

    fd_set &fd_set_;
    bool Socket::* member_;
    int &highest_fd_;
};
```

```cpp
fd_set read_fds;
FD_ZERO(&read_fds);
std::for_each(
    sockets_.begin(), sockets_.end(),
    Functor(read_fds, &Socket::read_avail_, highest_fd));
```

In lines 41 through 64, we define the class `Functor`. This class models a function object (a.k.a. a functor) which, once constructed, behaves exactly like a function would, thanks to the overloaded `operator()` \-- the function-call operator.[3](http://rlc.vlinder.ca/blog/2011/05/using-ranges-and-functional-programming-in-c-cpp4theselftaught/#footnote_2_1399 "Of course, I would not ordinarily call this functor Functor, but I had a point to make. Do not, however, call all your functors by the kind of thing they are -- name them according to their functionality, as you would \(should\) any other chunk of code.") In line 137[4](http://rlc.vlinder.ca/blog/2011/05/using-ranges-and-functional-programming-in-c-cpp4theselftaught/#footnote_3_1399 "135 in the actual code in Git"), the function-object is constructed and is subsequently called for each object in the `sockets_` list, meaning that for each of those objects, the function-call operator of the `Functor` class is called.

This is functional programming, as allowed by C++03 -- the current standard for C++.

Note that there's a wee bit of magic here: in order to allow us to use the same functor for each `fd_set` we mean to set up, we pass a _pointer to a boolean member_ of the `Socket` structure that will be checked in the function-call operator. That is what `bool Socket::* member_` means: `member_` is a pointer to a member of `Socket` that has `bool` type. In C++0x, we won't need to go to so much trouble: we will be able to use _lambda expressions_.
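If pointers to members are new to you, the following minimal sketch shows the mechanism in isolation. The `Socket` structure here is a stand-in written for this example: `fd_` and `read_avail_` mirror the names used above, while `write_avail_` and the `check` function are assumptions.

```cpp
// A stand-in for the real Socket structure. Only fd_ and read_avail_ appear
// in the listing above; write_avail_ is assumed for the sake of the example.
struct Socket
{
    int fd_;
    bool read_avail_;
    bool write_avail_;
};

// Hypothetical helper: returns the value of whichever bool member of
// `socket` the pointer-to-member `member` selects.
bool check(const Socket &socket, bool Socket::* member)
{
    return socket.*member; // .* applies the pointer-to-member to an object
}
```

Calling `check(s, &Socket::read_avail_)` and `check(s, &Socket::write_avail_)` runs the same code on two different members of `s`, which is what lets one functor class serve several `fd_set`s.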

Lambda expressions are a concise way to create a functor class by just defining three things:

1. what is _captured_ from the definition's environment (in our case, that would be the `fd_set` to work on and the currently-highest file descriptor)
2. the parameters of the function (just like any other function); and
3. the body of the function.

These three, together, produce a _closure_ which, if you're not used to it, looks a bit strange. Here's a simple example:

```cpp
#include <algorithm>
#include <iostream>

int main()
{
    using namespace std;
    int a[5] = {1, 2, 3, 4, 5};

    for_each(a, a + 5, [](int i){ cout << i << endl; });
}
```

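In this case, the lambda expression is `[](int i){ cout << i << endl; }`: it doesn't capture anything (`[]` is an empty capture set[5](http://rlc.vlinder.ca/blog/2011/05/using-ranges-and-functional-programming-in-c-cpp4theselftaught/#footnote_4_1399 "I should note that the term \"capture set\" is not mentioned anywhere in the draft standard. I take it to mean the set of actually captured variables, which is the result of the lambda-capture being applied")), takes an integer `i` as parameter and outputs that integer to `cout`.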

Now, the lambda expression in this code doesn't actually capture anything. To show how that works, let's capture the array that we loop over:

```cpp
#include <algorithm>
#include <iostream>

int main()
{
    using namespace std;
    int a[] = {1, 2, 3, 4, 5};

    auto const f = [=](){
        for_each(a, a + (sizeof(a) / sizeof(a[0])), [](int i){ cout << i << endl; });
    };

    a[0] = 2;

    f();
}
```

This lambda expression captures the array `a` by value, so changing the value of one of the integers in the array on line 13 doesn't actually have any effect on the output produced by calling the function on line 15. If we had captured the array by reference, the output would have been different.
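The difference between the two capture modes can be sketched side by side. The function names `seen_by_value` and `seen_by_reference` are invented for this illustration; each returns what its closure observes after the array has been modified.

```cpp
// Hypothetical example: the closure captures the array by value, so it
// keeps its own copy made at the point where the lambda is created.
int seen_by_value()
{
    int a[] = {1, 2, 3, 4, 5};
    auto const f = [=]() { return a[0]; };
    a[0] = 2;   // changes the original, not the closure's copy
    return f(); // the closure still sees the old first element
}

// Same code, but the closure captures by reference, so it refers to the
// original array and observes the later modification.
int seen_by_reference()
{
    int a[] = {1, 2, 3, 4, 5};
    auto const f = [&]() { return a[0]; };
    a[0] = 2;
    return f();
}
```

Note that a closure which captures by reference must not outlive the variables it refers to. Here both closures are called before `a` goes out of scope, so either mode is safe.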

There are three versions of this example that you can play with at ideone.com:

1. [the example](http://ideone.com/v5J6f) code itself
2. [a modified version of the example code](http://ideone.com/v8Wsr), in which there is another enclosed lambda expression
3. [another modified version of the example code](http://ideone.com/cMwCa), in which the enclosed lambda expression is returned immediately

If you have any questions about what you find when you play with that code, feel free to ask.

Lambda expressions are new features of the C++ programming language, but the functional style of programming has existed in C++ since the beginning: if it is possible to call an object as a function, it is possible to use a functional style of programming. Lambda expressions just make it a bit more interesting. The compilers we’ll want to support for Chausette, however, don’t have most of the features of C++0x (as most compilers don’t) but now that the final draft is out, we’ll add a few notes on C++0x in the installments, when it makes sense to do so.

Once you get a good handle on functional programming, generic template meta-programming becomes a lot easier as it is mostly functional programming, but the program runs at compile-time. We will discuss meta-programming in future installments.

1. Note that this is not functional programming (yet), but in order to understand how functional programming is thought of in (current) C++, it is important to understand how ranges work.
2. The real code will likely be more complicated because of some optimizations the implementation may do, but the general idea is the same.
3. Of course, I would not ordinarily call this functor `Functor`, but I had a point to make. Do not, however, call all your functors by the kind of thing they are -- name them according to their functionality, as you would (should) any other chunk of code.
4. 135 in the actual code in Git
5. I should note that the term "capture set" is not mentioned anywhere in the draft standard. I take it to mean the set of actually captured variables, which is the result of the _lambda-capture_ being applied
73 changes: 73 additions & 0 deletions _posts/2011-06-04-functional-programming-at-compile-time.md
@@ -0,0 +1,73 @@
---
layout: post
title: "Functional Programming at Compile-Time"
date: 2011-06-04 17:09:09
categories: blog
---
In the [previous installment](http://rlc.vlinder.ca/blog/2011/05/using-ranges-and-functional-programming-in-c-cpp4theselftaught/) I talked about functional programming a bit, introducing the idea of _functors_ and _lambda expressions_. This time, we will look at another type of functional programming: a type that is done at compile-time.

## Meta-functions

In functional programming, a function is anything you can call, and it can return anything — including another function. In meta-programming (programming “about” programming), functional programming takes the form of meta-functions returning meta-functions or values. All of this happens at compile-time, which means the values are constants and the meta-functions are types.

One of the simplest possible meta-functions is the `identity` function, which looks like this:

```cpp
template < typename T >
struct identity
{
    typedef T type;
};
```

This meta-function “returns” the type passed to it, which would be equivalent to a function that returns the value passed to it, but far more useful. This also allows me to show you a common convention in meta-programming, namely that the return type of a meta-function is usually called `type` and the return value (if applicable) of a meta-function is called `value`. Often, a meta-function that returns a value (which must of course be a compile-time constant) also returns a type – namely itself. That is not strictly needed, though.
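A small sketch may help to show the `type` and `value` conventions together. The `bit_count` meta-function and the two helper functions below are hypothetical, written only for this illustration; `identity` is repeated so the example stands on its own.

```cpp
#include <climits>

template < typename T >
struct identity
{
    typedef T type; // the "returned" type
};

// Hypothetical meta-function: "returns" the number of bits in T as `value`,
// and, following the convention, itself as `type`.
template < typename T >
struct bit_count
{
    enum { value = sizeof(T) * CHAR_BIT };
    typedef bit_count< T > type;
};

// identity< int >::type is just another way of spelling int.
identity< int >::type answer()
{
    return 42;
}

// Because the meta-function returns itself as `type`, `value` can also be
// reached through `type`.
unsigned int bits_in_char()
{
    return bit_count< char >::type::value;
}
```

"Calling" a meta-function therefore means naming a nested member: `::type` for a returned type, `::value` for a returned compile-time constant.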

Before we dive into the real code, I’ll tell you what the real code does: it generates a Fibonacci sequence at compile-time, and uses a run-time construct to fill an array with the generated sequence – and it uses only functional programming techniques (both at compile-time and at run-time) to do so.

A Fibonacci sequence is a sequence of numbers initially meant to model the growth of a population of rabbits, given a fixed generation time and unlimited resources. Each number in the sequence is the sum of the two previous numbers, and the sequence starts with 0, 1. That means that, in the array `a` we will generate, `a[0] = 0; a[1] = 1; a[n] = a[n - 2] + a[n - 1]`. This means our meta-function, which calculates the same at compile-time, will look like this:

```cpp
template < unsigned int n__ >
struct Fibonacci_
{
    enum { value = Fibonacci_< n__ - 1 >::value + Fibonacci_< n__ - 2 >::value };
    typedef Fibonacci_< n__ - 1 > next;
    typedef Fibonacci_< n__ > type;
};

template <>
struct Fibonacci_< 1 >
{
    enum { value = 1 };
    typedef Fibonacci_< 1 > type;
    typedef Fibonacci_< 0 > next;
};

template <>
struct Fibonacci_< 0 >
{
    typedef Fibonacci_< 0 > type;
    enum { value = 0 };
};
```

As you can see, the meta-function is a class (or `struct` in this case), with an `enum` and one or more `typedef`s in it. Sometimes (as we will see later) there are also function declarations, though at compile-time no run-time functions will actually be called, and there can also be other types.

In this case, we have two specializations of our class template: one in which `n__` is 1, and one in which `n__` is 0. We need those because for those two values, the resulting value is pre-defined – not calculated. For all other values of `n__`, the resulting value is calculated at compile-time by recursively specializing the class template with smaller and smaller values of `n__`, until we run into 0 and 1.

Compilers are smart: while at run-time, a similar approach would require ![2^n](http://s0.wp.com/latex.php?latex=2%5En&bg=ffffff&%23038;fg=000&%23038;s=0) function calls, the compiler need only specialize a class template once to know what the value is going to be, so we don’t have to worry about optimizing this implementation to make it one of linear complexity — it already is!
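To see that `value` really is a compile-time constant, it can be used somewhere the language insists on one, such as an array bound. The following sketch repeats `Fibonacci_` so that it stands on its own; `fib_buffer` and `fib_buffer_size` are names assumed for this example.

```cpp
#include <cstddef>

// Fibonacci_ as defined above, repeated so the sketch compiles on its own.
template < unsigned int n__ >
struct Fibonacci_
{
    enum { value = Fibonacci_< n__ - 1 >::value + Fibonacci_< n__ - 2 >::value };
    typedef Fibonacci_< n__ - 1 > next;
    typedef Fibonacci_< n__ > type;
};

template <>
struct Fibonacci_< 1 >
{
    enum { value = 1 };
    typedef Fibonacci_< 1 > type;
    typedef Fibonacci_< 0 > next;
};

template <>
struct Fibonacci_< 0 >
{
    typedef Fibonacci_< 0 > type;
    enum { value = 0 };
};

// An array bound must be known at compile-time, so this typedef would not
// compile if Fibonacci_< 10 >::value were computed at run-time.
typedef char fib_buffer[Fibonacci_< 10 >::value];

// Hypothetical helper: reports how large that array type turned out to be.
std::size_t fib_buffer_size()
{
    return sizeof(fib_buffer);
}
```

The tenth number in the sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 (counting from zero) is 55, so `fib_buffer` is an array of 55 `char`s.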

## SFINAE

One of the basic rules of C++ overloading is “Substitution Failure Is Not An Error” – that is: it is not a compile-time error for the compiler to come up with a candidate for a function call, try it out and find that it won’t work because something is missing in the (substituted) type. It only _becomes_ an error if there are no candidates left to try. For example, consider the following bit of code:

```cpp
#include <iostream>

using namespace std;

template < typename T >
void foo(const typename T::type *)
{
    cout << "first" << endl;
}

template < typename T >
void foo(...)
{
    cout << "second" << endl;
}

struct S
{
    // typedef int type;
};

int main()
{
    S s;
    foo< S >(0);
}
```

Which version of `foo` gets called?

The second.

The reason is that the structure `S` does not have a member type named `type` (it was commented out). The compiler will try the first version of `foo` first, substituting `S` for `T`, fail, because `type` is missing, then choose the next candidate, which will work. In this case, `0` would first be considered as a pointer to `S::type`, which is better than considering it for a parameter to a variadic function — and therefore takes precedence.

If you remove the comment from the typedef in `S`, so `S::type` exists, the first version will be called.

For this to be useful, you don’t really have to call the function. In fact, for this to be useful _at compile-time_, you _can’t_ call the function. You _can_, however, take the size of the return value of the function, like this:

```cpp
#include <iostream>

using namespace std;

typedef int yes;
struct no { int no[2]; };

template < typename T >
yes foo(const typename T::type *)
{
    cout << "first" << endl;
}

template < typename T >
no foo(...)
{
    cout << "second" << endl;
}

struct S
{
    typedef int type;
};

int main()
{
    S s;
    cout << ((sizeof(foo< S >(0)) == sizeof(yes)) ? "yes"
           : (sizeof(foo< S >(0)) == sizeof(no)) ? "no"
           : "dunno") << endl;
}
```

This code outputs “yes” when `S` has the `type` typedef, “no” if not – neither of the two functions gets called (it doesn’t output “first” or “second” and will never output “dunno” either).

In fact, the bodies of the two functions don’t need to exist:

```cpp
#include <iostream>

using namespace std;

typedef int yes;
struct no { int no[2]; };

template < typename T >
yes foo(const typename T::type *);

template < typename T >
no foo(...);

struct S
{
    // typedef int type;
};

int main()
{
    S s;
    cout << ((sizeof(foo< S >(0)) == sizeof(yes)) ? "yes"
           : (sizeof(foo< S >(0)) == sizeof(no)) ? "no"
           : "dunno") << endl;
}
```

This version will work just as well.

This means we can now select on the existence of a member type of a class, which we can use to create a meta-function that will tell us just that:

```cpp
namespace Details
{
    template < typename F >
    struct has_next
    {
        typedef char yes[1];
        typedef char no[2];

        template < typename C >
        static yes& test(typename C::next *);

        template < typename C >
        static no& test(...);

        enum { value = sizeof(test< F >(0)) == sizeof(yes) };
        typedef has_next< F > type;
    };
```

This meta-function will tell you whether a given type has a nested typedef (or type) called `next`. We’ll use this knowledge to know when to stop filling our array:

```cpp
template < typename F, bool has_next__ >
struct Filler_
{
    static void fill(unsigned int *a)
    {
        *a = F::value;
        Filler_< typename F::next, has_next< typename F::next >::value >::fill(++a);
    }
};

template < typename F >
struct Filler_< F, false >
{
    static void fill(unsigned int *a)
    {
        *a = F::value;
    }
};

template < typename F >
void fill(unsigned int *a)
{
    Filler_< F, has_next< F >::value >::fill(a);
}
```

As you can see, `Filler_::fill` calls itself recursively until the corresponding instance of `Fibonacci_` no longer has a `next` nested type. So, now `fill` can look like this:

```cpp
template < typename F >
void fill(unsigned int *a)
{
    Filler_< F, has_next< F >::value >::fill(a);
}
```

which will fill the array with the Fibonacci sequence.
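For reference, here is one way the pieces might fit together in a single translation unit. This is a sketch, not the code from the ideone link: closing the `Details` namespace before `fill`, qualifying the names in `fill` with `Details::`, and the `fill_from_fifth` helper are all assumptions made so that the example compiles on its own.

```cpp
template < unsigned int n__ >
struct Fibonacci_
{
    enum { value = Fibonacci_< n__ - 1 >::value + Fibonacci_< n__ - 2 >::value };
    typedef Fibonacci_< n__ - 1 > next;
    typedef Fibonacci_< n__ > type;
};

template <>
struct Fibonacci_< 1 >
{
    enum { value = 1 };
    typedef Fibonacci_< 1 > type;
    typedef Fibonacci_< 0 > next;
};

template <>
struct Fibonacci_< 0 >
{
    typedef Fibonacci_< 0 > type;
    enum { value = 0 };
};

namespace Details
{
    // Detects whether F has a nested type called `next`.
    template < typename F >
    struct has_next
    {
        typedef char yes[1];
        typedef char no[2];

        template < typename C >
        static yes& test(typename C::next *);

        template < typename C >
        static no& test(...);

        enum { value = sizeof(test< F >(0)) == sizeof(yes) };
        typedef has_next< F > type;
    };

    // Writes F::value, then recurses on F::next for as long as there is one.
    template < typename F, bool has_next__ >
    struct Filler_
    {
        static void fill(unsigned int *a)
        {
            *a = F::value;
            Filler_< typename F::next, has_next< typename F::next >::value >::fill(++a);
        }
    };

    // End of the recursion: F has no `next`, so only its own value is written.
    template < typename F >
    struct Filler_< F, false >
    {
        static void fill(unsigned int *a)
        {
            *a = F::value;
        }
    };
} // namespace Details (closing it here is an assumption)

template < typename F >
void fill(unsigned int *a)
{
    Details::Filler_< F, Details::has_next< F >::value >::fill(a);
}

// Hypothetical helper: `a` must have room for six values, because the
// recursion visits Fibonacci_< 5 > down to Fibonacci_< 0 >.
void fill_from_fifth(unsigned int *a)
{
    fill< Fibonacci_< 5 > >(a);
}
```

Because `next` steps from `Fibonacci_< n >` to `Fibonacci_< n - 1 >`, this sketch stores the largest value first: after `fill_from_fifth(a)`, the array holds 5, 3, 2, 1, 1, 0.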

You can play with this code in the on-line IDE at [ideone.com](http://ideone.com/Thq96)
